Dynamic model based algorithms for screening and genotyping over 100K SNPs on oligonucleotide microarrays

Bibliographic Details
Published in	Bioinformatics, Vol. 21, no. 9, pp. 1958-1963
Main Authors	Di, Xiaojun; Matsuzaki, Hajime; Webster, Teresa A.; Hubbell, Earl; Liu, Guoying; Dong, Shoulian; Bartell, Dan; Huang, Jing; Chiles, Richard; Yang, Geoffrey; Shen, Mei-mei; Kulp, David; Kennedy, Giulia C.; Mei, Rui; Jones, Keith W.; Cawley, Simon
Format	Journal Article
Language	English
Published	Oxford: Oxford University Press; Oxford Publishing Limited (England), 01.05.2005
Subjects	Algorithms; Biological and medical sciences; Fundamental and applied biological sciences. Psychology; General aspects; Genetics; Genotypes; Mathematics in biology. Statistical analysis. Models. Metrology. Data processing in biology (general aspects); Performance assessment; Population genetics; Population studies; Screening; Typing; DNA chip; Genotype; Single nucleotide polymorphism; Dynamic model; Microarray; Algorithm; Density
ISSN	1367-4803, 0266-7061, 1460-2059, 1367-4811
DOI	10.1093/bioinformatics/bti275

Summary:	Motivation: A high density of single nucleotide polymorphism (SNP) coverage on the genome is desirable and often an essential requirement for population genetics studies. Region-specific or chromosome-specific linkage studies also benefit from the availability of as many high-quality SNPs as possible. The availability of millions of SNPs from both Perlegen and the public domain, together with the development of an efficient microarray-based assay for genotyping SNPs, has raised some interesting analytical challenges. Effective methods for selecting optimal subsets of SNPs spanning the genome and for accurately calling genotypes from probe hybridization patterns have enabled the development of a new microarray-based system for robustly genotyping over 100 000 SNPs per sample.
Results: We introduce a new dynamic model-based algorithm (DM) for screening over 3 million SNPs and genotyping over 100 000 SNPs. The model is based on four possible underlying states for each probe quartet: Null, A, AB and B. We calculate a probe-level log likelihood for each model and then select between the four competing models with an SNP-level statistical aggregation across multiple probe quartets to provide a high-quality genotype call along with a quality measure of the call. We assess performance with HapMap reference genotypes, informative Mendelian inheritance relationships in families, and consistency between DM and another genotype classification method (MPAM). At a call rate of 95.91%, the concordance with reference genotypes from the HapMap Project is 99.81% based on over 1.5 million genotypes, the Mendelian error rate is 0.018% based on 10 trios, and the consistency between DM and MPAM is 99.90% at a comparable call rate of 97.18%. We also develop methods for SNP selection and optimal probe selection.
Availability: The DM algorithm is available in Affymetrix's Genotyping Tools software package and in Affymetrix's GDAS software package. See http://www.affymetrix.com for further information. 10K and 100K mapping array data are available on the Affymetrix website.
Contact: xiaojun_di@affymetrix.com
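The model-selection step described in the summary can be illustrated with a minimal Python sketch. It is not the published DM implementation: the summary does not specify how the probe-level log likelihoods are computed or how they are aggregated, so the sketch takes the per-quartet log likelihoods as given input and assumes a plain sum across quartets. The function name `call_genotype`, the `min_margin` parameter, the `"NoCall"` label, and the use of the margin between the two best states as the quality measure are all hypothetical choices made for illustration.

```python
from typing import Dict, List, Tuple

# The four underlying states named in the summary.
STATES = ("Null", "A", "AB", "B")


def call_genotype(
    quartet_logliks: List[Dict[str, float]],
    min_margin: float = 0.0,
) -> Tuple[str, float]:
    """Return (call, margin) for one SNP.

    quartet_logliks holds one dict per probe quartet, mapping each state
    to that quartet's log likelihood (hypothetical input, computed elsewhere).
    """
    if not quartet_logliks:
        raise ValueError("need at least one probe quartet")

    # Assumption: aggregate by summing the per-quartet log likelihoods.
    totals = {s: sum(q[s] for q in quartet_logliks) for s in STATES}

    # Rank the states from most to least supported.
    ranked = sorted(STATES, key=lambda s: totals[s], reverse=True)
    best, runner_up = ranked[0], ranked[1]

    # Assumption: the gap to the runner-up serves as the quality measure.
    margin = totals[best] - totals[runner_up]

    # A winning Null state or a margin below the threshold gives no call.
    if best == "Null" or margin < min_margin:
        return "NoCall", margin
    return best, margin


if __name__ == "__main__":
    quartets = [
        {"Null": -10.0, "A": -4.0, "AB": -6.0, "B": -12.0},
        {"Null": -9.0, "A": -5.0, "AB": -5.5, "B": -11.0},
    ]
    print(call_genotype(quartets))
    print(call_genotype(quartets, min_margin=3.0))
```

Raising `min_margin` trades call rate for accuracy in this sketch: fewer SNPs receive a call, but the calls that remain are better separated from the runner-up state.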