Effect of Co-segregating Markers on High-Density Genetic Maps and Prediction of Map Expansion Using Machine Learning Algorithms
Advances in sequencing and genotyping methods have enable cost-effective production of high throughput single nucleotide polymorphism (SNP) markers, making them the choice for linkage mapping. As a result, many laboratories have developed high-throughput SNP assays and built high-density genetic map...
Saved in:
| Published in | Frontiers in plant science Vol. 8; p. 1434 |
|---|---|
| Main Authors | , , , , |
| Format | Journal Article |
| Language | English |
| Published |
Switzerland
Frontiers Media S.A
23.08.2017
|
| Subjects | |
| Online Access | Get full text |
| ISSN | 1664-462X 1664-462X |
| DOI | 10.3389/fpls.2017.01434 |
Cover
| Summary: | Advances in sequencing and genotyping methods have enable cost-effective production of high throughput single nucleotide polymorphism (SNP) markers, making them the choice for linkage mapping. As a result, many laboratories have developed high-throughput SNP assays and built high-density genetic maps. However, the number of markers may, by orders of magnitude, exceed the resolution of recombination for a given population size so that only a minority of markers can accurately be ordered. Another issue attached to the so-called 'large p, small n' problem is that high-density genetic maps inevitably result in many markers clustering at the same position (co-segregating markers). While there are a number of related papers, none have addressed the impact of co-segregating markers on genetic maps. In the present study, we investigated the effects of co-segregating markers on high-density genetic map length and marker order using empirical data from two populations of wheat, Mohawk × Cocorit (durum wheat) and Norstar × Cappelle Desprez (bread wheat). The maps of both populations consisted of 85% co-segregating markers. Our study clearly showed that excess of co-segregating markers can lead to map expansion, but has little effect on markers order. To estimate the inflation factor (IF), we generated a total of 24,473 linkage maps (8,203 maps for Mohawk × Cocorit and 16,270 maps for Norstar × Cappelle Desprez). Using seven machine learning algorithms, we were able to predict with an accuracy of 0.7 the map expansion due to the proportion of co-segregating markers. For example in Mohawk × Cocorit, with 10 and 80% co-segregating markers the length of the map inflated by 4.5 and 16.6%, respectively. Similarly, the map of Norstar × Cappelle Desprez expanded by 3.8 and 11.7% with 10 and 80% co-segregating markers. With the increasing number of markers on SNP-chips, the proportion of co-segregating markers in high-density maps will continue to increase making map expansion unavoidable. Therefore, we suggest developers improve linkage mapping algorithms for efficient analysis of high-throughput data. This study outlines a practical strategy to estimate the IF due to the proportion of co-segregating markers and outlines a method to scale the length of the map accordingly. |
|---|---|
| Bibliography: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 23 Reviewed by: Eduard Akhunov, Kansas State University, United States; Marco Maccaferri, Consiglio per la Ricerca in Agricoltura e l’Analisi dell’Economia Agraria (CREA), Italy This article was submitted to Crop Science and Horticulture, a section of the journal Frontiers in Plant Science Edited by: Agata Gadaleta, Università degli Studi di Bari Aldo Moro, Italy |
| ISSN: | 1664-462X 1664-462X |
| DOI: | 10.3389/fpls.2017.01434 |