Risk Prediction Modeling on Family-Based Sequencing Data Using a Random Field Method

Family-based design is one of the most popular designs in genetic studies and has many unique features for risk-prediction research. It is robust against genetic heterogeneity, and the relatedness among family members can be informative for predicting an individual’s risk for disease with polygenic...

Full description

Saved in:
Bibliographic Details
Published inGenetics (Austin) Vol. 207; no. 1; pp. 63 - 73
Main Authors Wen, Yalu, Burt, Alexandra, Lu, Qing
Format Journal Article
LanguageEnglish
Published United States Genetics Society of America 01.09.2017
Subjects
Online AccessGet full text
ISSN1943-2631
0016-6731
1943-2631
DOI10.1534/genetics.117.199752

Cover

More Information
Summary:Family-based design is one of the most popular designs in genetic studies and has many unique features for risk-prediction research. It is robust against genetic heterogeneity, and the relatedness among family members can be informative for predicting an individual’s risk for disease with polygenic and shared environmental components of risk. Despite these strengths, family-based designs have been used infrequently in current risk-prediction studies, and their related statistical methods have not been well developed. In this article, we developed a generalized random field (GRF) method for family-based risk-prediction modeling on sequencing data. In GRF, subjects’ phenotypes are viewed as stochastic realizations of a random field in a space, and a subject’s phenotype is predicted by adjacent subjects, where adjacencies between subjects are determined by their genetic and within-family similarities. Different from existing methods that adjust for familial correlations, the GRF uses this information to form surrogates to further improve prediction accuracy. It also uses within-family information to capture predictors (e.g., rare mutations) that are homogeneous in families. Through simulations, we have demonstrated that the GRF method attained better performance than an existing method by considering additional information from family members and accounting for genetic heterogeneity. We further provided practical recommendations for designing family-based risk prediction studies. Finally, we illustrated the GRF method with an application to a whole-genome exome data set from the Michigan State University Twin Registry study.
Bibliography:ObjectType-Article-1
SourceType-Scholarly Journals-1
ObjectType-Feature-2
content type line 14
content type line 23
ISSN:1943-2631
0016-6731
1943-2631
DOI:10.1534/genetics.117.199752