Trans-ancestral rare variant association study with machine learning-based phenotyping for metabolic dysfunction-associated steatotic liver disease

Bibliographic Details
Published in Genome Biology Vol. 26; no. 1; p. 50
Main Authors Chen, Robert, Petrazzini, Ben Omega, Duffy, Áine, Rocheleau, Ghislain, Jordan, Daniel, Bansal, Meena, Do, Ron
Format Journal Article
Language English
Published London BioMed Central 10.03.2025
Springer Nature B.V
BMC
ISSN 1474-760X
1474-7596
DOI 10.1186/s13059-025-03518-5

Summary: Background: Genome-wide association studies (GWAS) have identified common variants associated with metabolic dysfunction-associated steatotic liver disease (MASLD). However, rare coding variant studies have been limited by phenotyping challenges and small sample sizes. We tested associations of rare and ultra-rare coding variants with proton density fat fraction (PDFF) and MASLD case–control status in 736,010 participants of diverse ancestries from the UK Biobank, All of Us, and BioMe and performed a trans-ancestral meta-analysis. We then developed models to accurately predict PDFF and MASLD status in the UK Biobank and tested associations with these predicted phenotypes to increase statistical power. Results: The trans-ancestral meta-analysis with PDFF and MASLD case–control status identifies two single variants and two gene-level associations in APOB, CDH5, MYCBP2, and XAB2. Association testing with predicted phenotypes, which replicates more known genetic variants from GWAS than true phenotypes, identifies 16 single variants and 11 gene-level associations implicating 23 additional genes. Two variants were polymorphic only among African ancestry participants, and several associations showed significant heterogeneity in ancestry- and sex-stratified analyses. In total, we identified 27 genes, of which 3 are monogenic causes of steatosis (APOB, G6PC1, PPARG), 4 were previously associated with MASLD (APOB, APOC3, INSR, PPARG), and 23 had supporting clinical, experimental, and/or genetic evidence. Conclusions: Our results suggest that trans-ancestral association analyses can identify ancestry-specific rare and ultra-rare coding variants in MASLD pathogenesis. Furthermore, we demonstrate the utility of machine learning in genetic investigations of difficult-to-phenotype diseases in trans-ancestral biobanks.
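
Illustrative note: the summary does not say which meta-analysis model or heterogeneity test the authors used. As a rough sketch only, the Python below shows one common way per-cohort or per-ancestry effect estimates can be combined: an inverse-variance-weighted fixed-effect estimate, with Cochran's Q and I² as heterogeneity measures. The function name and the input numbers are hypothetical and are not taken from the study.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance-weighted fixed-effect meta-analysis (illustrative).

    betas: per-cohort effect estimates for one variant or gene
    ses:   matching standard errors
    This is an assumed, generic method, not the study's actual pipeline.
    """
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and equal length")

    # Each cohort is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se ** 2) for se in ses]
    total_w = sum(weights)

    beta = sum(w * b for w, b in zip(weights, betas)) / total_w
    se = math.sqrt(1.0 / total_w)
    z = beta / se
    # Two-sided p-value under a standard normal approximation.
    p = math.erfc(abs(z) / math.sqrt(2.0))

    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (b - beta) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    # I^2: share of variation attributed to heterogeneity, floored at 0.
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0

    return {"beta": beta, "se": se, "z": z, "p": p, "q": q, "df": df, "i2": i2}


if __name__ == "__main__":
    # Hypothetical effect sizes from two cohorts, for demonstration only.
    result = fixed_effect_meta([0.2, 0.4], [0.1, 0.1])
    for key, value in result.items():
        print(f"{key}: {value:.4g}")
```

A large Q relative to its degrees of freedom (or a high I²) would indicate that effect sizes differ across strata, which is the kind of heterogeneity the summary reports for some ancestry- and sex-stratified associations.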