GenoCore: A simple and fast algorithm for core subset selection from large genotype datasets

Selecting core subsets from plant genotype datasets is important for improving cost-effectiveness and shortening the time required for genome-wide association studies (GWAS) and genomics-assisted breeding of crop species. Recently, a large number of genetic markers (>100,000 sin...

Bibliographic Details
Published in	PloS one Vol. 12; no. 7; p. e0181420
Main Authors	Jeong, Seongmun; Kim, Jae-Yoon; Jeong, Soon-Chun; Kang, Sung-Taeg; Moon, Jung-Kyung; Kim, Namshin
Format	Journal Article
Language	English
Published	United States: Public Library of Science (PLoS), 20.07.2017
Subjects	Access to Information; Algorithms; Analysis; Bioinformatics; Biology and Life Sciences; Biotechnology; Breeding; Collection; Computer and Information Sciences; Computer programs; Cost analysis; Crop science; Crops; Databases, Genetic; Datasets; Datasets as Topic; Gene Frequency; Gene polymorphism; Genetic aspects; Genetic distance; Genetic diversity; Genetic markers; Genome-wide association studies; Genomes; Genomics; Internet; Markers; Methods; Oryza - genetics; Phenotype; Physical Sciences; Picking; Plant breeding; Polymorphism; Polymorphism, Single Nucleotide; Principal Component Analysis; Reproducibility of Results; Research and Analysis Methods; Rice; Single nucleotide polymorphisms; Single-nucleotide polymorphism; Software; Software development; Soybeans; Studies; Triticum - genetics
Online Access	Get full text
ISSN	1932-6203
DOI	10.1371/journal.pone.0181420

Abstract	Selecting core subsets from plant genotype datasets is important for improving cost-effectiveness and shortening the time required for genome-wide association studies (GWAS) and genomics-assisted breeding of crop species. Recently, large numbers of genetic markers (>100,000 single nucleotide polymorphisms) have been identified from high-density single nucleotide polymorphism (SNP) arrays and next-generation sequencing (NGS) data. However, no software is available for selecting an efficient and consistent core subset from such large datasets. Software is needed that can consistently extract the genetically important samples from a population. Here we present a new program, GenoCore, which quickly and efficiently finds a core subset representing the entire population. We introduce simple coverage and diversity scores, which reflect genotype errors and genetic variation and help to select samples rapidly and accurately from crop genotype datasets. We compared our method with other core collection software on example datasets to validate its performance in terms of genetic distance, diversity, coverage, required system resources, and the number of selected samples. GenoCore selects the smallest, most consistent, and most representative core collection from all samples, uses less memory with more efficient scores, and shows greater genetic coverage than the other software tested. GenoCore is written in R and can be accessed online, with an example dataset and test results, at https://github.com/lovemun/Genocore.
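The abstract describes selecting samples by coverage and diversity scores but does not give the scoring formulas. As a rough illustration of the general idea only, the sketch below uses a generic greedy maximum-coverage heuristic in Python. It is not GenoCore's actual scoring and not its R implementation. The function name `greedy_core_subset`, the `target_coverage` parameter, the use of `None` for missing calls, and the definition of coverage as the share of distinct (marker, genotype) pairs represented are all assumptions made for this example.

```python
from typing import Hashable, Optional, Sequence

# Assumption for this sketch: None marks a missing genotype call.
Genotype = Optional[Hashable]


def greedy_core_subset(
    genotypes: Sequence[Sequence[Genotype]],
    target_coverage: float = 1.0,
) -> list[int]:
    """Greedily pick sample indices until the target coverage is reached.

    genotypes[s][m] is the call of sample s at marker m. Coverage is the
    fraction of distinct (marker, genotype) pairs seen in the whole
    population that also occur in the selected samples (hypothetical
    definition, chosen for illustration).
    """
    # Every distinct (marker, genotype) pair observed in the population.
    all_pairs = {
        (m, g)
        for row in genotypes
        for m, g in enumerate(row)
        if g is not None
    }
    if not all_pairs:
        return []

    covered: set = set()
    selected: list[int] = []
    remaining = set(range(len(genotypes)))

    def gain(sample: int) -> int:
        # Number of not-yet-covered pairs this sample would contribute.
        return sum(
            1
            for m, g in enumerate(genotypes[sample])
            if g is not None and (m, g) not in covered
        )

    while remaining and len(covered) / len(all_pairs) < target_coverage:
        # Ties are broken in favor of the lowest sample index.
        best = max(sorted(remaining), key=gain)
        if gain(best) == 0:
            break  # no remaining sample adds anything new
        selected.append(best)
        remaining.discard(best)
        covered.update(
            (m, g) for m, g in enumerate(genotypes[best]) if g is not None
        )
    return selected


# Toy data: 4 samples, 3 markers.
demo = [
    ["AA", "AA", "AA"],
    ["AA", "AB", "AA"],
    ["BB", "AB", "BB"],
    ["AA", "AA", "BB"],
]
print(greedy_core_subset(demo))  # [0, 2]
```

On the toy data, samples 0 and 2 together carry every genotype observed at each marker, so the loop stops after two picks. A real tool would also need to handle genotype errors and scale to more than 100,000 markers, which this sketch does not attempt.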
Audience	Academic
Author	Jeong, Soon-Chun; Kim, Jae-Yoon; Kim, Namshin; Jeong, Seongmun; Moon, Jung-Kyung; Kang, Sung-Taeg
AuthorAffiliation	1 Personalized Genomic Medicine Research Center, Division of Strategic Research Groups, Korea Research Institute of Bioscience and Biotechnology, Daejeon, Korea; 2 Department of Biological Sciences, KRIBB School, Korea University of Science and Technology, Daejeon, Korea; 3 Bio-Evaluation Center, Korea Research Institute of Bioscience and Biotechnology, Cheongju, Chungbuk, Korea; 4 Department of Crop Science and Biotechnology, Dankook University, Cheonan, Chungnam, Korea; 5 National Institute of Crop Science, Rural Development Administration, Jeonju, Jeonbuk, Korea; UMR-S1134, INSERM, Université Paris Diderot, INTS, France
BackLink	https://www.ncbi.nlm.nih.gov/pubmed/28727806$$D View this record in MEDLINE/PubMed
CitedBy_id	crossref_primary_10_48130_abd_0025_0002 crossref_primary_10_3390_plants13050618 crossref_primary_10_3389_fpls_2023_1112297 crossref_primary_10_3389_fgene_2020_567757 crossref_primary_10_1270_jsbbs_22071 crossref_primary_10_1371_journal_pone_0255418 crossref_primary_10_1038_s41438_018_0080_8 crossref_primary_10_1186_s13007_023_01084_0 crossref_primary_10_3389_fpls_2023_1130814 crossref_primary_10_1007_s10722_021_01211_7 crossref_primary_10_1007_s00122_023_04477_w crossref_primary_10_1016_j_foreco_2022_120748 crossref_primary_10_1016_j_indcrop_2023_117657 crossref_primary_10_1002_tpg2_20447 crossref_primary_10_1093_gigascience_giz151 crossref_primary_10_1007_s10722_022_01469_5 crossref_primary_10_3389_fpls_2024_1429279 crossref_primary_10_1007_s10592_023_01581_8 crossref_primary_10_1007_s00122_024_04683_0 crossref_primary_10_1007_s11295_020_01462_y crossref_primary_10_1093_plphys_kiac006 crossref_primary_10_5808_GI_2020_18_1_e8 crossref_primary_10_1038_s41467_022_28362_0 crossref_primary_10_1016_j_sajb_2023_09_021 crossref_primary_10_1038_s41467_023_41251_4 crossref_primary_10_1016_j_jarmap_2024_100605 crossref_primary_10_17660_ActaHortic_2020_1294_12 crossref_primary_10_9787_KJBS_2021_53_3_277 crossref_primary_10_3389_fpls_2020_01040 crossref_primary_10_1089_bio_2018_0033 crossref_primary_10_3390_genes15050603 crossref_primary_10_1186_s12859_018_2209_z crossref_primary_10_1186_s43897_021_00014_9 crossref_primary_10_1371_journal_pone_0224074 crossref_primary_10_3390_plants12061305 crossref_primary_10_1093_g3journal_jkab145 crossref_primary_10_3390_agronomy11030581 crossref_primary_10_1007_s10722_022_01513_4 crossref_primary_10_3390_f13030489
Cites_doi	10.1104/pp.111.185033 10.1038/ncomms10532 10.1002/j.1538-7305.1948.tb01338.x 10.1186/1471-2105-13-219 10.1111/tpj.12755 10.1093/bioinformatics/btm313 10.1371/journal.pone.0010780 10.1093/jhered/92.1.93 10.1186/1471-2105-10-243 10.1186/1471-2105-13-312
ContentType	Journal Article
Copyright	COPYRIGHT 2017 Public Library of Science. 2017 Jeong et al. This is an open access article distributed under the terms of the Creative Commons Attribution License: http://creativecommons.org/licenses/by/4.0/ (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
DeliveryMethod	fulltext_linktorsrc
Discipline	Sciences (General)
DocumentTitleAlternate	GenoCore
EISSN	1932-6203
ExternalDocumentID	1921160639 oai_doaj_org_article_96751c1966c24bfab9bc2d037655895d PMC5519076 A498922579 28727806 10_1371_journal_pone_0181420
Genre	Journal Article
GrantInformation_xml	– fundername: ; grantid: initiative program – fundername: ; grantid: NRF-2011-0030049 – fundername: ; grantid: PJ011929 – fundername: ; grantid: NRF-2014M3C9A3064552
IsDoiOpenAccess	true
IsOpenAccess	true
IsPeerReviewed	true
IsScholarly	true
Issue	7
Language	English
License	This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Creative Commons Attribution License
LinkModel	DirectLink
Notes	ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 content type line 23 Competing Interests: The authors have declared that no competing interests exist.
ORCID	0000-0002-0038-461X
OpenAccessLink	http://journals.scholarsportal.info/openUrl.xqy?doi=10.1371/journal.pone.0181420
PMID	28727806
PQID	1921160639
PQPubID	1436336
PageCount	e0181420
ProviderPackageCode	CITATION AAYXX
PublicationCentury	2000
PublicationDate	2017-07-20
PublicationDateYYYYMMDD	2017-07-20
PublicationDate_xml	– month: 07 year: 2017 text: 2017-07-20 day: 20
PublicationDecade	2010
PublicationPlace	United States
PublicationPlace_xml	– name: United States – name: San Francisco – name: San Francisco, CA USA
PublicationTitle	PloS one
PublicationTitleAlternate	PLoS One
PublicationYear	2017
Publisher	Public Library of Science Public Library of Science (PLoS)
Publisher_xml	– name: Public Library of Science – name: Public Library of Science (PLoS)
StartPage	e0181420
Title	GenoCore: A simple and fast algorithm for core subset selection from large genotype datasets
URI	https://www.ncbi.nlm.nih.gov/pubmed/28727806 https://www.proquest.com/docview/1921160639 https://www.proquest.com/docview/1922504597 https://pubmed.ncbi.nlm.nih.gov/PMC5519076 https://doaj.org/article/96751c1966c24bfab9bc2d037655895d http://dx.doi.org/10.1371/journal.pone.0181420
Volume	12