Integrative analysis of sequencing and array genotype data for discovering disease associations with rare mutations

Bibliographic Details
Published in Proceedings of the National Academy of Sciences - PNAS Vol. 112; no. 4; pp. 1019-1024
Main Authors Hu, Yi-Juan; Li, Yun; Auer, Paul L.; Lin, Dan-Yu
Format Journal Article
Language English
Published United States: National Academy of Sciences, 27.01.2015
ISSN 0027-8424 (print), 1091-6490 (online)
DOI 10.1073/pnas.1406143112


Abstract In the large cohorts that have been used for genome-wide association studies (GWAS), it is prohibitively expensive to sequence all cohort members. A cost-effective strategy is to sequence subjects with extreme values of quantitative traits or those with specific diseases. By imputing the sequencing data from the GWAS data for the cohort members who are not selected for sequencing, one can dramatically increase the number of subjects with information on rare variants. However, ignoring the uncertainties of imputed rare variants in downstream association analysis will inflate the type I error when sequenced subjects are not a random subset of the GWAS subjects. In this article, we provide a valid and efficient approach to combining observed and imputed data on rare variants. We consider commonly used gene-level association tests, all of which are constructed from the score statistic for assessing the effects of individual variants on the trait of interest. We show that the score statistic based on the observed genotypes for sequenced subjects and the imputed genotypes for nonsequenced subjects is unbiased. We derive a robust variance estimator that reflects the true variability of the score statistic regardless of the sampling scheme and imputation quality, such that the corresponding association tests always have correct type I error. We demonstrate through extensive simulation studies that the proposed tests are substantially more powerful than the use of accurately imputed variants only and the use of sequencing data alone. We provide an application to the Women’s Health Initiative. The relevant software is freely available.

Significance High-throughput DNA sequencing provides an unprecedented opportunity to discover rare genetic variants associated with complex diseases and traits. However, sequencing a large number of subjects is prohibitively expensive. It is common to select subjects for sequencing from the cohorts that have collected genotyping array data. We impute the sequencing data from the array data for the cohort members who are not selected for sequencing and perform gene-level association tests for rare variants by properly combining the observed genotypes for sequenced subjects and the imputed genotypes for nonsequenced subjects. This integrative analysis is substantially more powerful than the use of sequencing data alone and can accelerate the search for disease-causing mutations.
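
Illustrative sketch (not part of the catalog record or the article): the abstract describes gene-level tests built from per-variant score statistics that use observed genotypes for sequenced subjects and imputed genotypes for the rest. The Python below shows one generic way such a statistic could be assembled for a quantitative trait without covariates, using a burden-type sum and an empirical sum-of-outer-products variance. This is a textbook construction under those simplifying assumptions; it is not the robust variance estimator derived by the authors, and the function names `combined_genotypes` and `burden_score_test` are hypothetical.

```python
import math
import numpy as np


def combined_genotypes(g_observed, g_imputed, sequenced):
    """Observed genotypes for sequenced subjects, imputed dosages otherwise.

    All three names are hypothetical inputs: two (n, m) arrays and a
    length-n boolean indicator of who was sequenced.
    """
    sequenced = np.asarray(sequenced, dtype=bool)
    return np.where(sequenced[:, None],
                    np.asarray(g_observed, dtype=float),
                    np.asarray(g_imputed, dtype=float))


def burden_score_test(y, g, weights=None):
    """Burden-type score test with an empirical variance (generic sketch).

    Assumes a quantitative trait y and no covariates. Returns the
    per-variant score vector, the standardized burden statistic z,
    and a two-sided normal p-value.
    """
    y = np.asarray(y, dtype=float)
    g = np.asarray(g, dtype=float)
    m = g.shape[1]
    w = np.ones(m) if weights is None else np.asarray(weights, dtype=float)

    resid = y - y.mean()                    # trait residuals under the null
    g_centered = g - g.mean(axis=0)         # center each variant column
    contrib = resid[:, None] * g_centered   # per-subject score contributions
    u = contrib.sum(axis=0)                 # per-variant score statistics
    v = contrib.T @ contrib                 # empirical covariance of scores

    t = float(w @ u)
    var_t = float(w @ v @ w)
    if var_t <= 0.0:                        # no variation: nothing to test
        return u, 0.0, 1.0
    z = t / math.sqrt(var_t)
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return u, z, p
```

In this sketch the imputed dosages enter the score exactly as observed genotypes do; the sampling scheme and imputation quality, which the article's variance estimator is designed to accommodate, are not modeled here.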
Author_xml – sequence: 1
  givenname: Yi-Juan
  surname: Hu
  fullname: Hu, Yi-Juan
  organization: Department of Biostatistics and Bioinformatics, Emory University, Atlanta, GA 30322
– sequence: 2
  givenname: Yun
  surname: Li
  fullname: Li, Yun
  organization: Department of Genetics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599-7264
– sequence: 3
  givenname: Paul L.
  surname: Auer
  fullname: Auer, Paul L.
  organization: Joseph J. Zilber School of Public Health, University of Wisconsin, Milwaukee, WI 53201-0413
– sequence: 4
  givenname: Dan-Yu
  surname: Lin
  fullname: Lin, Dan-Yu
  organization: Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599-7420
BackLink https://www.ncbi.nlm.nih.gov/pubmed/25583502$$D View this record in MEDLINE/PubMed
ContentType Journal Article
Copyright Volumes 1–89 and 106–112, copyright as a collective work only; author(s) retains copyright to individual articles
Copyright National Academy of Sciences Jan 27, 2015
Discipline Sciences (General)
DocumentTitleAlternate Integration of sequencing and array genotype data
EISSN 1091-6490
EndPage 1024
ExternalDocumentID 10.1073/pnas.1406143112
PMC4313847
3577692811
25583502
10_1073_pnas_1406143112
112_4_1019
26454225
US201600149091
Genre Journal Article
Research Support, N.I.H., Extramural
Feature
GrantInformation_xml – fundername: NCI NIH HHS
  grantid: P01CA142538
– fundername: NIA NIH HHS
  grantid: HHSN271201100004C
– fundername: NHLBI NIH HHS
  grantid: HHSN268201100004I
– fundername: NHGRI NIH HHS
  grantid: R01HG006703
– fundername: NCI NIH HHS
  grantid: R01 CA082659
– fundername: WHI NIH HHS
  grantid: HHSN268201100004C
– fundername: NHLBI NIH HHS
  grantid: HHSN268201100003I
– fundername: NCI NIH HHS
  grantid: R01CA082659
– fundername: WHI NIH HHS
  grantid: HHSN268201100002C
– fundername: NHLBI NIH HHS
  grantid: RC2 HL-102926
– fundername: HHS | National Institutes of Health (NIH)
  grantid: R01HG006292
– fundername: HHS | National Institutes of Health (NIH)
  grantid: R01HG006703
– fundername: HHS | National Institutes of Health (NIH)
  grantid: R37GM047845
– fundername: HHS | National Institutes of Health (NIH)
  grantid: P01CA142538
– fundername: HHS | National Institutes of Health (NIH)
  grantid: R01CA082659
IsDoiOpenAccess true
IsOpenAccess true
IsPeerReviewed true
IsScholarly true
Issue 4
Keywords data integration
whole-exome sequencing
gene-level association tests
genotype imputation
linkage disequilibrium
Language English
Notes http://dx.doi.org/10.1073/pnas.1406143112
Author contributions: Y.-J.H. and D.-Y.L. designed research; Y.-J.H. and D.-Y.L. performed research; Y.-J.H., Y.L., and P.L.A. analyzed data; and Y.-J.H. and D.-Y.L. wrote the paper.
Edited by Elizabeth A. Thompson, University of Washington, Seattle, WA, and approved December 9, 2014 (received for review April 3, 2014)
OpenAccessLink https://proxy.k.utb.cz/login?url=https://www.pnas.org/content/pnas/112/4/1019.full.pdf
PMID 25583502
PQID 1650648565
PQPubID 42026
PageCount 6
PublicationTitle Proceedings of the National Academy of Sciences - PNAS
PublicationTitleAlternate Proc Natl Acad Sci U S A
SubjectTerms Biological Sciences
DNA Mutational Analysis - methods
Genetic Diseases, Inborn - genetics
genetic variation
Genetics
Genomics
Genotype
Genotype & phenotype
Genotypes
genotyping
Genotyping Techniques - methods
Health promotion
high-throughput nucleotide sequencing
Humans
Models, Genetic
Mutation
Oligonucleotide Array Sequence Analysis - methods
Physical Sciences
Sampling techniques
Simulation
Software
Title Integrative analysis of sequencing and array genotype data for discovering disease associations with rare mutations
URI https://www.jstor.org/stable/26454225
http://www.pnas.org/content/112/4/1019.abstract
https://www.ncbi.nlm.nih.gov/pubmed/25583502
https://www.proquest.com/docview/1650648565
https://www.proquest.com/docview/1652423602
https://www.proquest.com/docview/1803127028
https://pubmed.ncbi.nlm.nih.gov/PMC4313847
https://www.pnas.org/content/pnas/112/4/1019.full.pdf