GenoCore: A simple and fast algorithm for core subset selection from large genotype datasets

Bibliographic Details
Published in PLoS ONE, Vol. 12, no. 7, p. e0181420
Main Authors Jeong, Seongmun; Kim, Jae-Yoon; Jeong, Soon-Chun; Kang, Sung-Taeg; Moon, Jung-Kyung; Kim, Namshin
Format Journal Article
Language English
Published United States: Public Library of Science (PLoS), 20.07.2017
ISSN 1932-6203
DOI 10.1371/journal.pone.0181420

Abstract Selecting core subsets from plant genotype datasets is important for improving cost-effectiveness and shortening the time required for analyses such as genome-wide association studies (GWAS) and genomics-assisted breeding of crop species. Recently, large numbers of genetic markers (>100,000 single nucleotide polymorphisms) have been identified from high-density single nucleotide polymorphism (SNP) arrays and next-generation sequencing (NGS) data. However, no software is available for selecting an efficient and consistent core subset from such large datasets. Software is needed that can consistently extract the genetically important samples in a population. Here we present a new program, GenoCore, which quickly and efficiently finds a core subset representing the entire population. We introduce simple coverage and diversity scores, which reflect genotype errors and genetic variation and help to select samples rapidly and accurately from crop genotype datasets. We compare our method with other core collection software on example datasets to validate its performance in terms of genetic distance, diversity, coverage, required system resources, and the number of selected samples. Among the software tested, GenoCore selects the smallest, most consistent, and most representative core collection from all samples, uses less memory with more efficient scores, and shows greater genetic coverage. GenoCore is written in R and is available online, with an example dataset and test results, at https://github.com/lovemun/Genocore.
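The idea of choosing samples by how much genetic coverage they add can be illustrated with a minimal Python sketch. This is not the GenoCore implementation: GenoCore is written in R, and the abstract describes its coverage and diversity scores only at a high level. The sketch assumes a simplified definition in which coverage is the fraction of observed (marker, genotype) pairs represented in the core, and it greedily adds the sample that contributes the most new pairs. The names `observed_pairs` and `greedy_core`, the tie-breaking rule, and the toy data are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

# Assumed input layout: sample name -> one genotype call per marker (None = missing call).
Genotypes = Dict[str, List[Optional[str]]]


def observed_pairs(calls: List[Optional[str]]) -> Set[Tuple[int, str]]:
    """Return the (marker index, genotype) pairs a sample carries, skipping missing calls."""
    return {(i, g) for i, g in enumerate(calls) if g is not None}


def greedy_core(genotypes: Genotypes, target_coverage: float = 1.0) -> Tuple[List[str], float]:
    """Greedily pick samples until the core reaches the target coverage.

    Coverage here is an illustrative stand-in, not GenoCore's score:
    (pairs present in the core) / (pairs present in the whole dataset).
    """
    per_sample = {name: observed_pairs(calls) for name, calls in genotypes.items()}
    all_pairs: Set[Tuple[int, str]] = set()
    for pairs in per_sample.values():
        all_pairs |= pairs

    covered: Set[Tuple[int, str]] = set()
    core: List[str] = []
    remaining = sorted(per_sample)  # sorted so that ties are broken by sample name

    while remaining and all_pairs and len(covered) / len(all_pairs) < target_coverage:
        # Pick the sample that adds the most pairs not yet covered.
        best = max(remaining, key=lambda s: len(per_sample[s] - covered))
        gain = per_sample[best] - covered
        if not gain:
            break  # no remaining sample adds anything new
        covered |= gain
        core.append(best)
        remaining.remove(best)

    coverage = len(covered) / len(all_pairs) if all_pairs else 1.0
    return core, coverage


if __name__ == "__main__":
    toy = {
        "s1": ["AA", "AA", "AA"],
        "s2": ["AA", "AB", "BB"],
        "s3": ["BB", "AB", "BB"],
        "s4": ["AA", "AA", None],
    }
    print(greedy_core(toy))
```

A greedy rule like this keeps memory use proportional to the genotype table itself, since no pairwise distance matrix between samples is built; whether GenoCore works the same way should be checked against the paper and repository.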
Audience Academic
Author Jeong, Soon-Chun
Kim, Jae-Yoon
Kim, Namshin
Jeong, Seongmun
Moon, Jung-Kyung
Kang, Sung-Taeg
AuthorAffiliation 1 Personalized Genomic Medicine Research Center, Division of Strategic Research Groups, Korea Research Institute of Bioscience and Biotechnology, Daejeon, Korea
2 Department of Biological Sciences, KRIBB School, Korea University of Science and Technology, Daejeon, Korea
3 Bio-Evaluation Center, Korea Research Institute of Bioscience and Biotechnology, Cheongju, Chungbuk, Korea
4 Department of Crop Science and Biotechnology, Dankook University, Cheonan, Chungnam, Korea
5 National Institute of Crop Science, Rural Development Administration, Jeonju, Jeonbuk, Korea
UMR-S1134, INSERM, Université Paris Diderot, INTS, FRANCE
Author_xml – sequence: 1
  givenname: Seongmun
  orcidid: 0000-0002-0038-461X
  surname: Jeong
  fullname: Jeong, Seongmun
– sequence: 2
  givenname: Jae-Yoon
  surname: Kim
  fullname: Kim, Jae-Yoon
– sequence: 3
  givenname: Soon-Chun
  surname: Jeong
  fullname: Jeong, Soon-Chun
– sequence: 4
  givenname: Sung-Taeg
  surname: Kang
  fullname: Kang, Sung-Taeg
– sequence: 5
  givenname: Jung-Kyung
  surname: Moon
  fullname: Moon, Jung-Kyung
– sequence: 6
  givenname: Namshin
  surname: Kim
  fullname: Kim, Namshin
CitedBy_id crossref_primary_10_48130_abd_0025_0002
crossref_primary_10_3390_plants13050618
crossref_primary_10_3389_fpls_2023_1112297
crossref_primary_10_3389_fgene_2020_567757
crossref_primary_10_1270_jsbbs_22071
crossref_primary_10_1371_journal_pone_0255418
crossref_primary_10_1038_s41438_018_0080_8
crossref_primary_10_1186_s13007_023_01084_0
crossref_primary_10_3389_fpls_2023_1130814
crossref_primary_10_1007_s10722_021_01211_7
crossref_primary_10_1007_s00122_023_04477_w
crossref_primary_10_1016_j_foreco_2022_120748
crossref_primary_10_1016_j_indcrop_2023_117657
crossref_primary_10_1002_tpg2_20447
crossref_primary_10_1093_gigascience_giz151
crossref_primary_10_1007_s10722_022_01469_5
crossref_primary_10_3389_fpls_2024_1429279
crossref_primary_10_1007_s10592_023_01581_8
crossref_primary_10_1007_s00122_024_04683_0
crossref_primary_10_1007_s11295_020_01462_y
crossref_primary_10_1093_plphys_kiac006
crossref_primary_10_5808_GI_2020_18_1_e8
crossref_primary_10_1038_s41467_022_28362_0
crossref_primary_10_1016_j_sajb_2023_09_021
crossref_primary_10_1038_s41467_023_41251_4
crossref_primary_10_1016_j_jarmap_2024_100605
crossref_primary_10_17660_ActaHortic_2020_1294_12
crossref_primary_10_9787_KJBS_2021_53_3_277
crossref_primary_10_3389_fpls_2020_01040
crossref_primary_10_1089_bio_2018_0033
crossref_primary_10_3390_genes15050603
crossref_primary_10_1186_s12859_018_2209_z
crossref_primary_10_1186_s43897_021_00014_9
crossref_primary_10_1371_journal_pone_0224074
crossref_primary_10_3390_plants12061305
crossref_primary_10_1093_g3journal_jkab145
crossref_primary_10_3390_agronomy11030581
crossref_primary_10_1007_s10722_022_01513_4
crossref_primary_10_3390_f13030489
ContentType Journal Article
Copyright COPYRIGHT 2017 Public Library of Science
2017 Jeong et al. This is an open access article distributed under the terms of the Creative Commons Attribution License: http://creativecommons.org/licenses/by/4.0/ (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
DOI 10.1371/journal.pone.0181420
Discipline Sciences (General)
DocumentTitleAlternate GenoCore
EISSN 1932-6203
ExternalDocumentID 1921160639
oai_doaj_org_article_96751c1966c24bfab9bc2d037655895d
PMC5519076
A498922579
28727806
10_1371_journal_pone_0181420
Genre Journal Article
GrantInformation_xml – fundername: ;
  grantid: initiative program
– fundername: ;
  grantid: NRF-2011-0030049
– fundername: ;
  grantid: PJ011929
– fundername: ;
  grantid: NRF-2014M3C9A3064552
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 7
Language English
License This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Creative Commons Attribution License
Notes Competing Interests: The authors have declared that no competing interests exist.
ORCID 0000-0002-0038-461X
OpenAccessLink http://journals.scholarsportal.info/openUrl.xqy?doi=10.1371/journal.pone.0181420
PMID 28727806
PublicationCentury 2000
PublicationDate 2017-07-20
PublicationDateYYYYMMDD 2017-07-20
PublicationDate_xml – month: 07
  year: 2017
  text: 2017-07-20
  day: 20
PublicationDecade 2010
PublicationPlace United States
PublicationPlace_xml – name: United States
– name: San Francisco
– name: San Francisco, CA USA
PublicationTitle PloS one
PublicationTitleAlternate PLoS One
PublicationYear 2017
Publisher Public Library of Science
Public Library of Science (PLoS)
Publisher_xml – name: Public Library of Science
– name: Public Library of Science (PLoS)
StartPage e0181420
SubjectTerms Access to Information
Algorithms
Analysis
Bioinformatics
Biology and Life Sciences
Biotechnology
Breeding
Collection
Computer and Information Sciences
Computer programs
Cost analysis
Crop science
Crops
Databases, Genetic
Datasets
Datasets as Topic
Gene Frequency
Gene polymorphism
Genetic aspects
Genetic distance
Genetic diversity
Genetic markers
Genome-wide association studies
Genomes
Genomics
Internet
Markers
Methods
Oryza - genetics
Phenotype
Physical Sciences
Picking
Plant breeding
Polymorphism
Polymorphism, Single Nucleotide
Principal Component Analysis
Reproducibility of Results
Research and Analysis Methods
Rice
Single nucleotide polymorphisms
Single-nucleotide polymorphism
Software
Software development
Soybeans
Studies
Triticum - genetics
Title GenoCore: A simple and fast algorithm for core subset selection from large genotype datasets
URI https://www.ncbi.nlm.nih.gov/pubmed/28727806
https://www.proquest.com/docview/1921160639
https://www.proquest.com/docview/1922504597
https://pubmed.ncbi.nlm.nih.gov/PMC5519076
https://doaj.org/article/96751c1966c24bfab9bc2d037655895d
http://dx.doi.org/10.1371/journal.pone.0181420
Volume 12