Integrative analysis of sequencing and array genotype data for discovering disease associations with rare mutations

Bibliographic Details
Published in	Proceedings of the National Academy of Sciences - PNAS Vol. 112; no. 4; pp. 1019 - 1024
Main Authors	Hu, Yi-Juan, Li, Yun, Auer, Paul L., Lin, Dan-Yu
Format	Journal Article
Language	English
Published	United States: National Academy of Sciences, 27.01.2015
Subjects	Biological Sciences; DNA Mutational Analysis - methods; Genetic Diseases, Inborn - genetics; genetic variation; Genetics; Genomics; Genotype; Genotype & phenotype; Genotypes; genotyping; Genotyping Techniques - methods; Health promotion; high-throughput nucleotide sequencing; Humans; Models, Genetic; Mutation; Oligonucleotide Array Sequence Analysis - methods; Physical Sciences; Sampling techniques; Simulation; Software; data integration; whole-exome sequencing; gene-level association tests; genotype imputation; linkage disequilibrium
Online Access	Get full text
ISSN	0027-8424, 1091-6490
DOI	10.1073/pnas.1406143112

Abstract	In the large cohorts that have been used for genome-wide association studies (GWAS), it is prohibitively expensive to sequence all cohort members. A cost-effective strategy is to sequence subjects with extreme values of quantitative traits or those with specific diseases. By imputing the sequencing data from the GWAS data for the cohort members who are not selected for sequencing, one can dramatically increase the number of subjects with information on rare variants. However, ignoring the uncertainties of imputed rare variants in downstream association analysis will inflate the type I error when sequenced subjects are not a random subset of the GWAS subjects. In this article, we provide a valid and efficient approach to combining observed and imputed data on rare variants. We consider commonly used gene-level association tests, all of which are constructed from the score statistic for assessing the effects of individual variants on the trait of interest. We show that the score statistic based on the observed genotypes for sequenced subjects and the imputed genotypes for nonsequenced subjects is unbiased. We derive a robust variance estimator that reflects the true variability of the score statistic regardless of the sampling scheme and imputation quality, such that the corresponding association tests always have correct type I error. We demonstrate through extensive simulation studies that the proposed tests are substantially more powerful than the use of accurately imputed variants only and the use of sequencing data alone. We provide an application to the Women’s Health Initiative. The relevant software is freely available.
Significance	High-throughput DNA sequencing provides an unprecedented opportunity to discover rare genetic variants associated with complex diseases and traits. However, sequencing a large number of subjects is prohibitively expensive. It is common to select subjects for sequencing from the cohorts that have collected genotyping array data. We impute the sequencing data from the array data for the cohort members who are not selected for sequencing and perform gene-level association tests for rare variants by properly combining the observed genotypes for sequenced subjects and the imputed genotypes for nonsequenced subjects. This integrative analysis is substantially more powerful than the use of sequencing data alone and can accelerate the search for disease-causing mutations.
Author affiliations	Hu, Yi-Juan: Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA 30322; Li, Yun: Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599-7264; Auer, Paul L.: Joseph J. Zilber School of Public Health, University of Wisconsin, Milwaukee, WI 53201-0413; Lin, Dan-Yu: Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599-7420
BackLink	https://www.ncbi.nlm.nih.gov/pubmed/25583502 (View this record in MEDLINE/PubMed)
Copyright	Volumes 1–89 and 106–112, copyright as a collective work only; author(s) retains copyright to individual articles. Copyright National Academy of Sciences Jan 27, 2015.
Discipline	Sciences (General)
DocumentTitleAlternate	Integration of sequencing and array genotype data
Genre	Journal Article; Research Support, N.I.H., Extramural; Feature
GrantInformation	NCI NIH HHS: P01CA142538; NIA NIH HHS: HHSN271201100004C; NHLBI NIH HHS: HHSN268201100004I; NHGRI NIH HHS: R01HG006703; NCI NIH HHS: R01 CA082659; WHI NIH HHS: HHSN268201100004C; NHLBI NIH HHS: HHSN268201100003I; NCI NIH HHS: R01CA082659; WHI NIH HHS: HHSN268201100002C; NHLBI NIH HHS: RC2 HL-102926; HHS | National Institutes of Health (NIH): R01HG006292; HHS | National Institutes of Health (NIH): R01HG006703; HHS | National Institutes of Health (NIH): R37GM047845; HHS | National Institutes of Health (NIH): P01CA142538; HHS | National Institutes of Health (NIH): R01CA082659
IsDoiOpenAccess	true
IsOpenAccess	true
IsPeerReviewed	true
IsScholarly	true
Keywords	data integration; whole-exome sequencing; gene-level association tests; genotype imputation; linkage disequilibrium
Notes	Author contributions: Y.-J.H. and D.-Y.L. designed research; Y.-J.H. and D.-Y.L. performed research; Y.-J.H., Y.L., and P.L.A. analyzed data; and Y.-J.H. and D.-Y.L. wrote the paper. Edited by Elizabeth A. Thompson, University of Washington, Seattle, WA, and approved December 9, 2014 (received for review April 3, 2014).
OpenAccessLink	https://proxy.k.utb.cz/login?url=https://www.pnas.org/content/pnas/112/4/1019.full.pdf
PMID	25583502
PageCount	6
PublicationPlace	United States; Washington
PublicationTitleAlternate	Proc Natl Acad Sci U S A
URI	https://www.jstor.org/stable/26454225 http://www.pnas.org/content/112/4/1019.abstract https://www.ncbi.nlm.nih.gov/pubmed/25583502 https://www.proquest.com/docview/1650648565 https://www.proquest.com/docview/1652423602 https://www.proquest.com/docview/1803127028 https://pubmed.ncbi.nlm.nih.gov/PMC4313847 https://www.pnas.org/content/pnas/112/4/1019.full.pdf