Trans-ancestral rare variant association study with machine learning-based phenotyping for metabolic dysfunction-associated steatotic liver disease
| Published in | Genome Biology Vol. 26; no. 1; p. 50 |
|---|---|
| Format | Journal Article |
| Language | English |
| Published | London; BioMed Central; 10.03.2025; Springer Nature B.V; BMC |
| ISSN | 1474-760X, 1474-7596 |
| DOI | 10.1186/s13059-025-03518-5 |
| Summary: | Background
Genome-wide association studies (GWAS) have identified common variants associated with metabolic dysfunction-associated steatotic liver disease (MASLD). However, rare coding variant studies have been limited by phenotyping challenges and small sample sizes. We tested associations of rare and ultra-rare coding variants with proton density fat fraction (PDFF) and MASLD case–control status in 736,010 participants of diverse ancestries from the UK Biobank, All of Us, and BioMe and performed a trans-ancestral meta-analysis. We then developed models to accurately predict PDFF and MASLD status in the UK Biobank and tested associations with these predicted phenotypes to increase statistical power.
Results
The trans-ancestral meta-analysis with PDFF and MASLD case–control status identifies two single variants and two gene-level associations in APOB, CDH5, MYCBP2, and XAB2. Association testing with predicted phenotypes, which replicates more known genetic variants from GWAS than true phenotypes, identifies 16 single variants and 11 gene-level associations implicating 23 additional genes. Two variants were polymorphic only among African ancestry participants and several associations showed significant heterogeneity in ancestry and sex-stratified analyses. In total, we identified 27 genes, of which 3 are monogenic causes of steatosis (APOB, G6PC1, PPARG), 4 were previously associated with MASLD (APOB, APOC3, INSR, PPARG), and 23 had supporting clinical, experimental, and/or genetic evidence.
Conclusions
Our results suggest that trans-ancestral association analyses can identify ancestry-specific rare and ultra-rare coding variants in MASLD pathogenesis. Furthermore, we demonstrate the utility of machine learning in genetic investigations of difficult-to-phenotype diseases in trans-ancestral biobanks. |