Correcting for volunteer bias in GWAS increases SNP effect sizes and heritability estimates
Published in | Nature communications Vol. 16; no. 1; pp. 3578 - 11 |
---|---|
Main Authors | , , , , |
Format | Journal Article |
Language | English |
Published | London: Nature Publishing Group UK, 15.04.2025; Nature Publishing Group; Nature Portfolio |
ISSN | 2041-1723 |
DOI | 10.1038/s41467-025-58684-8 |
Summary: | Selection bias in genome-wide association studies (GWASs) due to volunteer-based sampling (volunteer bias) is poorly understood. The UK Biobank (UKB), one of the largest and most widely used cohorts, is highly selected. Using inverse probability (IP) weights, we estimate inverse probability weighted GWAS (WGWAS) to correct GWAS summary statistics in the UKB for volunteer bias. Our IP weights were estimated using UK Census data – the largest source of population-representative data – made representative of the UKB’s sampling population. These weights have a substantial SNP-based heritability of 4.8% (s.e. 0.8%), suggesting they capture volunteer bias in GWAS. Across ten phenotypes, WGWAS yields larger SNP effect sizes, larger heritability estimates, and altered gene-set tissue expression compared with GWAS, despite reducing the effective sample size by 62% on average. The impact of volunteer bias on GWAS results varies by phenotype: traits related to disease, health behaviors, and socioeconomic status are most affected. We recommend that GWAS consortia provide population weights for their data sets, or use population-representative samples.
Genetic studies may be biased due to volunteer-based biobanks. Using UK Biobank, the authors apply inverse probability weighting based on UK Census data, finding that genome-wide association studies showed bias in SNP effect sizes, heritability, and gene-set tissue expression. |
---|---|
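
The summary describes two quantities that a small numerical sketch may help make concrete: a SNP effect estimated with inverse probability weights, and the loss of effective sample size that weighting entails. The Python below is only an illustration under stated assumptions, not the authors' pipeline. The function names, the toy data, and the participation probabilities are hypothetical, and Kish's approximation is used for the effective sample size, which may differ from the definition used in the article.

```python
import numpy as np


def kish_effective_n(weights):
    """Kish's approximate effective sample size: (sum w)^2 / sum(w^2).

    Assumption: a common survey-statistics approximation, not necessarily
    the definition used in the article.
    """
    w = np.asarray(weights, dtype=float)
    return w.sum() ** 2 / np.square(w).sum()


def weighted_snp_effect(genotype, phenotype, weights):
    """Slope of a weighted least-squares regression of phenotype on genotype.

    Hypothetical helper: one SNP, an intercept, and no other covariates.
    """
    g = np.asarray(genotype, dtype=float)
    y = np.asarray(phenotype, dtype=float)
    w = np.asarray(weights, dtype=float)
    g_centered = g - np.average(g, weights=w)
    y_centered = y - np.average(y, weights=w)
    return np.sum(w * g_centered * y_centered) / np.sum(w * g_centered ** 2)


# Toy data (entirely simulated; the values carry no empirical meaning).
rng = np.random.default_rng(0)
n = 20_000
genotype = rng.binomial(2, 0.3, size=n)            # allele counts 0/1/2
phenotype = 0.2 * genotype + rng.normal(size=n)    # assumed true effect 0.2

# Hypothetical participation model: higher phenotype, more likely to volunteer.
p_participate = 1.0 / (1.0 + np.exp(-phenotype))
volunteered = rng.random(n) < p_participate

g_s = genotype[volunteered]
y_s = phenotype[volunteered]
ip_weights = 1.0 / p_participate[volunteered]      # inverse probability weights

unweighted = weighted_snp_effect(g_s, y_s, np.ones_like(ip_weights))
weighted = weighted_snp_effect(g_s, y_s, ip_weights)

print(f"unweighted effect: {unweighted:.3f}")
print(f"IP-weighted effect: {weighted:.3f}")
print(f"sample size: {g_s.size}, Kish effective n: {kish_effective_n(ip_weights):.0f}")
```

With equal weights the effective sample size equals the actual sample size; the more the weights vary, the smaller it becomes. This is the general trade-off the summary refers to when it reports a reduced effective sample size under weighting.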