Efficient analysis of large datasets and sex bias with ADMIXTURE

Bibliographic Details
Published in BMC Bioinformatics Vol. 17; no. 1; p. 218
Main Authors Shringarpure, Suyash S., Bustamante, Carlos D., Lange, Kenneth, Alexander, David H.
Format Journal Article
Language English
Published London BioMed Central 23.05.2016
BioMed Central Ltd
Springer Nature B.V
ISSN 1471-2105
DOI 10.1186/s12859-016-1082-x

Summary: Background: A number of large genomic datasets are being generated for studies of human ancestry and diseases. The ADMIXTURE program is commonly used to infer individual ancestry from genomic data. Results: We describe two improvements to the ADMIXTURE software. The first enables ADMIXTURE to infer ancestry for a new set of individuals using cluster allele frequencies from a reference set of individuals. Using data from the 1000 Genomes Project, we show that this allows ADMIXTURE to infer ancestry for 10,920 individuals in a few hours (a 5× speedup). This mode also allows ADMIXTURE to correctly estimate individual ancestry and allele frequencies from a set of related individuals. The second modification allows ADMIXTURE to correctly handle X-chromosome (and other haploid) data from both males and females. We demonstrate increased power to detect sex-biased admixture in African-American individuals from the 1000 Genomes Project using this extension. Conclusions: These modifications make ADMIXTURE more efficient and versatile, allowing users to extract more information from large genomic datasets.
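
The two ideas in the summary (inferring ancestry for new individuals from fixed reference allele frequencies, and treating haploid data differently from diploid data) can be illustrated with a small sketch. This is not ADMIXTURE's implementation: it substitutes a plain EM update under the standard binomial admixture model, and the function name `project_ancestry`, its parameters, and the toy frequencies are all hypothetical. It assumes every cluster allele frequency lies strictly between 0 and 1.

```python
def project_ancestry(genotypes, freqs, ploidy=2, iters=200):
    """Estimate ancestry fractions q for ONE individual by EM,
    holding the cluster allele frequencies fixed (hypothetical helper).

    genotypes: allele counts per SNP, each in 0..ploidy, or None if missing
    freqs:     freqs[k][j] = frequency of the counted allele in cluster k
               at SNP j (assumed strictly between 0 and 1)
    ploidy:    2 for diploid data, 1 for haploid data such as male X
    """
    num_clusters = len(freqs)
    q = [1.0 / num_clusters] * num_clusters  # start from equal fractions
    observed = [j for j, g in enumerate(genotypes) if g is not None]
    if not observed:
        return q  # no data: keep the uniform starting point

    for _ in range(iters):
        new_q = [0.0] * num_clusters
        for j in observed:
            g = genotypes[j]
            # Mixture frequency of the counted allele for this individual.
            p_mix = sum(q[k] * freqs[k][j] for k in range(num_clusters))
            for k in range(num_clusters):
                # Expected number of counted alleles drawn from cluster k.
                if g > 0:
                    new_q[k] += g * q[k] * freqs[k][j] / p_mix
                # Expected number of the other alleles drawn from cluster k.
                if g < ploidy:
                    new_q[k] += (ploidy - g) * q[k] * (1.0 - freqs[k][j]) / (1.0 - p_mix)
        # Each observed SNP contributes exactly `ploidy` allele copies.
        total = ploidy * len(observed)
        q = [value / total for value in new_q]
    return q


# Toy reference frequencies for two clusters at four SNPs (made-up values).
reference_freqs = [
    [0.9, 0.9, 0.1, 0.1],  # cluster 0
    [0.1, 0.1, 0.9, 0.9],  # cluster 1
]
diploid_q = project_ancestry([2, 2, 0, 0], reference_freqs, ploidy=2)
haploid_q = project_ancestry([1, 1, 0, 0], reference_freqs, ploidy=1)
```

Because the reference frequencies stay fixed, each new individual is fitted independently of the others, which is the property the summary credits for faster analysis of large sample sets. Setting `ploidy=1` makes each haploid genotype contribute one allele copy rather than two, so male X-chromosome calls are not counted as if they were homozygous diploid genotypes.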