Evolutionary history inferred from the de novo assembly of a nonmodel organism, the blue‐eyed black lemur

Bibliographic Details
Published in Molecular ecology Vol. 24; no. 17; pp. 4392 - 4405
Main Authors Meyer, Wynn K; Venkat, Aarti; Kermany, Amir R; van de Geijn, Bryce; Zhang, Sidi; Przeworski, Molly
Format Journal Article
Language English
Published England Blackwell Scientific Publications 01.09.2015
Blackwell Publishing Ltd
ISSN 0962-1083
1365-294X
DOI 10.1111/mec.13327

More Information
Summary: Lemurs, the living primates most distantly related to humans, demonstrate incredible diversity in behaviour, life history patterns and adaptive traits. Although many lemur species are endangered within their native Madagascar, there is no high‐quality genome assembly from this taxon, limiting population and conservation genetic studies. One critically endangered lemur is the blue‐eyed black lemur Eulemur flavifrons. This species is fixed for blue irises, a convergent trait that evolved at least four times in primates and was subject to positive selection in humans, where 5′ regulatory variation of OCA2 explains most of the brown/blue eye colour differences. We built a de novo genome assembly for E. flavifrons, providing the most complete lemur genome to date, and a high confidence consensus sequence for close sister species E. macaco, the (brown‐eyed) black lemur. From diversity and divergence patterns across the genomes, we estimated a recent split time of the two species (160 Kya) and temporal fluctuations in effective population sizes that accord with known environmental changes. By looking for regions of unusually low diversity, we identified potential signals of directional selection in E. flavifrons at MITF, a melanocyte development gene that regulates OCA2 and has previously been associated with variation in human iris colour, as well as at several other genes involved in melanin biosynthesis in mammals. Our study thus illustrates how whole‐genome sequencing of a few individuals can illuminate the demographic and selection history of nonmodel species.
Bibliography: http://dx.doi.org/10.1111/mec.13327
istex:7A57BB5AC4AB14B8E6B88FA6DB5672E5B4F9A186
ark:/67375/WNG-4G52LNNS-Z
Appendix S1 Sample information and DNA extraction for whole-genome sequenced samples.
Appendix S2 Choice of sequencing libraries, library preparation, and sequencing for genome assemblies.
Appendix S3 Quality control on raw reads.
Appendix S4 Choice of pre-assembly correction method for PE reads.
Appendix S5 Choice of assembler.
Appendix S6 Evaluation of insert size distributions and resolution of 'bimodal' libraries.
Appendix S7 Command line parameters for assembly generation.
Appendix S8 Details of memory usage and run times.
Appendix S9 Aligning blue-eyed black lemur contigs to black lemur bacterial artificial chromosomes (BACs).
Appendix S10 Core Eukaryotic Gene Mapping Approach (CEGMA) analysis.
Appendix S11 SNP calling.
Appendix S12 Sanger-based resequencing of additional samples within the scaffold containing the OCA2 ortholog.
Appendix S13 Simulations of pairwise sequentially Markovian coalescent (PSMC) performance on scaffold data and with a population split.
Appendix S14 Estimating species split time.
Appendix S15 Choice of parameters for PSMC and scaling of PSMC output and species split time.
Appendix S16 Identification of candidate regions for recent positive selection in one species.
Appendix S17 Annotation of orthologs of OCA2 and additional human iris pigmentation candidate genes within the blue-eyed black lemur genome.
Appendix S18 Identification of candidate regulatory changes within the scaffold containing the OCA2 ortholog.
Appendix S19 Calculation of summary statistics from the combined sample using angsd and ngstools.
Appendix S20 Assessment of admixture in the combined sample.
Appendix S21 Annotation of candidate selected regions and gene ontology analysis.
Appendix S22 Genome size estimation.
Appendix S23 Identification of neighboring scaffolds to the scaffold containing the OCA2 ortholog.
Fig. S1 Overview of assembly and analysis pipeline.
Fig. S2 Peak memory consumption and time to completion for genome assembly steps.
Fig. S3 Simulations to assess impact of window size, scaffolds, generation time, mutation rate, adjustment for coverage, and population split on output of PSMC.
Fig. S4 Coverage distributions for mapped reads, q-mers, and k-mers.
Fig. S5 Principal components analysis (PCA) and estimation of admixture proportions indicate the absence of admixture in the combined sample.
Fig. S6 Bimodal distributions of estimated insert sizes indicate the presence of artifacts in some library preparations.
Fig. S7 Distributions of summary statistics from scans for selection in two-sample and full datasets.
Table S1 Read counts for each blue-eyed black lemur library.
Table S2 Statistics for Quake-corrected assembly (QCA) and SOAP-corrected assembly (SCA).
Table S3 Primers.
ArticleID: MEC13327
Present address: Program in Biological and Biomedical Sciences, Harvard Medical School, 25 Shattuck Street, Gordon Hall Room 005, Boston, MA 02115, USA
Present address: Department of Integrative Biology, University of California, VLSB 4130, Berkeley, CA 94704, USA
These authors contributed equally to this work.
M.P., W.K.M. and A.V. designed the study and wrote the manuscript. A.V. performed the de novo genome assembly and characterized its quality. W.K.M. performed the reference-based assembly, identified polymorphic sites, performed demographic inferences and scans for selection, and analysed data from additional samples. A.R.K. performed simulations demonstrating the effects of draft assembly on PSMC inference. B.vdG. inferred potential transcription factor binding sites in the HERC2/OCA2 region. S.Z. annotated pigmentation genes.
Present address: Department of Biological Sciences and Department of Systems Biology, Columbia University, 1002A Fairchild Center, 10th Floor, M.C. 2424, New York, NY 10027, USA