Effect of Co-segregating Markers on High-Density Genetic Maps and Prediction of Map Expansion Using Machine Learning Algorithms

Advances in sequencing and genotyping methods have enable cost-effective production of high throughput single nucleotide polymorphism (SNP) markers, making them the choice for linkage mapping. As a result, many laboratories have developed high-throughput SNP assays and built high-density genetic map...

Full description

Saved in:
Bibliographic Details
Published inFrontiers in plant science Vol. 8; p. 1434
Main Authors N’Diaye, Amidou, Haile, Jemanesh K., Fowler, D. Brian, Ammar, Karim, Pozniak, Curtis J.
Format Journal Article
LanguageEnglish
Published Switzerland Frontiers Media S.A 23.08.2017
Subjects
Online AccessGet full text
ISSN1664-462X
1664-462X
DOI10.3389/fpls.2017.01434

Cover

More Information
Summary:Advances in sequencing and genotyping methods have enable cost-effective production of high throughput single nucleotide polymorphism (SNP) markers, making them the choice for linkage mapping. As a result, many laboratories have developed high-throughput SNP assays and built high-density genetic maps. However, the number of markers may, by orders of magnitude, exceed the resolution of recombination for a given population size so that only a minority of markers can accurately be ordered. Another issue attached to the so-called 'large p, small n' problem is that high-density genetic maps inevitably result in many markers clustering at the same position (co-segregating markers). While there are a number of related papers, none have addressed the impact of co-segregating markers on genetic maps. In the present study, we investigated the effects of co-segregating markers on high-density genetic map length and marker order using empirical data from two populations of wheat, Mohawk × Cocorit (durum wheat) and Norstar × Cappelle Desprez (bread wheat). The maps of both populations consisted of 85% co-segregating markers. Our study clearly showed that excess of co-segregating markers can lead to map expansion, but has little effect on markers order. To estimate the inflation factor (IF), we generated a total of 24,473 linkage maps (8,203 maps for Mohawk × Cocorit and 16,270 maps for Norstar × Cappelle Desprez). Using seven machine learning algorithms, we were able to predict with an accuracy of 0.7 the map expansion due to the proportion of co-segregating markers. For example in Mohawk × Cocorit, with 10 and 80% co-segregating markers the length of the map inflated by 4.5 and 16.6%, respectively. Similarly, the map of Norstar × Cappelle Desprez expanded by 3.8 and 11.7% with 10 and 80% co-segregating markers. With the increasing number of markers on SNP-chips, the proportion of co-segregating markers in high-density maps will continue to increase making map expansion unavoidable. Therefore, we suggest developers improve linkage mapping algorithms for efficient analysis of high-throughput data. This study outlines a practical strategy to estimate the IF due to the proportion of co-segregating markers and outlines a method to scale the length of the map accordingly.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 23
Reviewed by: Eduard Akhunov, Kansas State University, United States; Marco Maccaferri, Consiglio per la Ricerca in Agricoltura e l’Analisi dell’Economia Agraria (CREA), Italy
This article was submitted to Crop Science and Horticulture, a section of the journal Frontiers in Plant Science
Edited by: Agata Gadaleta, Università degli Studi di Bari Aldo Moro, Italy
ISSN:1664-462X
1664-462X
DOI:10.3389/fpls.2017.01434