Prediction of causal genes at GWAS loci with pleiotropic gene regulatory effects using sets of correlated instrumental variables

Bibliographic Details
Published in ArXiv.org
Main Authors Khan, Mariyam, Ludl, Adriaan-Alexander, Bankier, Sean, Björkegren, Johan Lm, Michoel, Tom
Format Journal Article
Language English
Published United States Cornell University 20.09.2024
ISSN 2331-8422

Abstract Multivariate Mendelian randomization (MVMR) is a statistical technique that uses sets of genetic instruments to estimate the direct causal effects of multiple exposures on an outcome of interest. At genomic loci with pleiotropic gene regulatory effects, that is, loci where the same genetic variants are associated with multiple nearby genes, MVMR can potentially be used to predict candidate causal genes. However, consensus in the field dictates that the genetic instruments in MVMR must be independent (not in linkage disequilibrium), which is usually not possible when considering a group of candidate genes from the same locus. Here we used causal inference theory to show that MVMR with correlated instruments satisfies the instrumental set condition. This is a classical result by Brito and Pearl (2002) for structural equation models that guarantees the identifiability of individual causal effects in situations where multiple exposures collectively, but not individually, separate a set of instrumental variables from an outcome variable. Extensive simulations confirmed the validity and usefulness of these theoretical results. Importantly, the causal effect estimates remained unbiased and their variance small even when instruments were highly correlated, while bias introduced by horizontal pleiotropy or LD matrix sampling error was comparable to standard MR. We applied MVMR with correlated instrumental variable sets at genome-wide significant loci for coronary artery disease (CAD) risk using expression quantitative trait loci (eQTL) data from seven vascular and metabolic tissues in the STARNET study. Our method predicts causal genes at twelve loci, each associated with multiple colocated genes in multiple tissues. We confirm causal roles for PHACTR1 and ADAMTS7 in arterial tissues, among others. However, the extensive degree of regulatory pleiotropy across tissues and the limited number of causal variants in each locus still require that MVMR be run on a tissue-by-tissue basis, and testing all gene-tissue pairs with cis-eQTL associations at a given locus in a single model to predict causal gene-tissue combinations remains infeasible. Our results show that within tissues, MVMR with dependent, as opposed to independent, sets of instrumental variables significantly expands the scope for predicting causal genes in disease risk loci with pleiotropic regulatory effects. However, considering risk loci with regulatory pleiotropy that also spans across tissues remains an unsolved problem.
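As an illustration of the kind of estimator the abstract describes, the following is a minimal sketch, not the authors' implementation, of a summary-statistics MVMR estimate obtained by generalized least squares, where the correlation between instruments enters through the LD matrix. The function name mvmr_correlated_ivw, the toy LD matrix, the effect sizes, and the noise-free outcome associations are all assumptions made for this example.

```python
import numpy as np


def mvmr_correlated_ivw(beta_x, beta_y, se_y, ld):
    """GLS estimate of direct causal effects from summary statistics.

    beta_x : (p, k) marginal SNP-exposure associations (k exposures)
    beta_y : (p,)   marginal SNP-outcome associations
    se_y   : (p,)   standard errors of beta_y
    ld     : (p, p) LD correlation matrix of the p instruments
    """
    beta_x = np.asarray(beta_x, dtype=float)
    beta_y = np.asarray(beta_y, dtype=float)
    se_y = np.asarray(se_y, dtype=float)
    # Covariance of the outcome associations: Omega_ij = se_i * se_j * ld_ij
    omega = np.outer(se_y, se_y) * np.asarray(ld, dtype=float)
    # Solve linear systems rather than forming Omega^{-1} explicitly
    omega_inv_bx = np.linalg.solve(omega, beta_x)
    omega_inv_by = np.linalg.solve(omega, beta_y)
    xtx = beta_x.T @ omega_inv_bx
    theta = np.linalg.solve(xtx, beta_x.T @ omega_inv_by)
    # Fixed-effect standard errors from (B' Omega^{-1} B)^{-1}
    se_theta = np.sqrt(np.diag(np.linalg.inv(xtx)))
    return theta, se_theta


# Toy locus (hypothetical numbers): 6 instruments in strong LD, 2 genes.
rng = np.random.default_rng(0)
p, k = 6, 2
idx = np.arange(p)
ld = 0.9 ** np.abs(idx[:, None] - idx[None, :])   # AR(1)-style LD pattern
joint_effects = rng.normal(size=(p, k))           # per-SNP joint effects on each gene
beta_x = ld @ joint_effects                       # marginal effects induced by LD
theta_true = np.array([0.5, 0.0])                 # only the first gene is causal
se_y = np.full(p, 0.05)
beta_y = beta_x @ theta_true                      # noise-free outcome associations

theta_hat, se_hat = mvmr_correlated_ivw(beta_x, beta_y, se_y, ld)
print(theta_hat, se_hat)
```

In this noise-free setting the estimator recovers the assumed effects despite the strong correlation between instruments; with real summary statistics, sampling error in the associations and in the LD matrix would of course affect the estimates.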
Author_xml – sequence: 1
  givenname: Mariyam
  surname: Khan
  fullname: Khan, Mariyam
  organization: Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway
– sequence: 2
  givenname: Adriaan-Alexander
  surname: Ludl
  fullname: Ludl, Adriaan-Alexander
  organization: Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway
– sequence: 3
  givenname: Sean
  surname: Bankier
  fullname: Bankier, Sean
  organization: Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway
– sequence: 4
  givenname: Johan Lm
  surname: Björkegren
  fullname: Björkegren, Johan Lm
  organization: Department of Genetics & Genomic Sciences/Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, USA
– sequence: 5
  givenname: Tom
  surname: Michoel
  fullname: Michoel, Tom
  organization: Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway
BackLink https://www.ncbi.nlm.nih.gov/pubmed/38259344 (View this record in MEDLINE/PubMed)
ContentType Journal Article
DBID NPM
7X8
5PM
DatabaseName PubMed
MEDLINE - Academic
PubMed Central (Full Participant titles)
DatabaseTitle PubMed
MEDLINE - Academic
DatabaseTitleList
MEDLINE - Academic
PubMed
Database_xml – sequence: 1
  dbid: NPM
  name: PubMed
  url: https://proxy.k.utb.cz/login?url=http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=PubMed
  sourceTypes: Index Database
DeliveryMethod fulltext_linktorsrc
Discipline Physics
EISSN 2331-8422
ExternalDocumentID PMC10802687
38259344
Genre Journal Article
Preprint
GrantInformation_xml – fundername: NHLBI NIH HHS
  grantid: R01 HL168174
– fundername: NHLBI NIH HHS
  grantid: R01 HL164577
– fundername: NHLBI NIH HHS
  grantid: R01 HL148167
– fundername: NHLBI NIH HHS
  grantid: R01 HL166428
– fundername: NHLBI NIH HHS
  grantid: R01 HL148239
ISSN 2331-8422
IngestDate Thu Aug 21 18:36:19 EDT 2025
Mon Sep 08 10:22:55 EDT 2025
Wed Feb 19 02:03:59 EST 2025
IsOpenAccess true
IsPeerReviewed false
IsScholarly false
Language English
License This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which allows reusers to copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator.
OpenAccessLink https://pubmed.ncbi.nlm.nih.gov/PMC10802687
PMID 38259344
PQID 2917863630
PQPubID 23479
ParticipantIDs pubmedcentral_primary_oai_pubmedcentral_nih_gov_10802687
proquest_miscellaneous_2917863630
pubmed_primary_38259344
PublicationCentury 2000
PublicationDate 2024-Sep-20
PublicationDateYYYYMMDD 2024-09-20
PublicationDate_xml – month: 09
  year: 2024
  text: 2024-Sep-20
  day: 20
PublicationDecade 2020
PublicationPlace United States
PublicationPlace_xml – name: United States
PublicationTitle ArXiv.org
PublicationTitleAlternate ArXiv
PublicationYear 2024
Publisher Cornell University
Publisher_xml – name: Cornell University
References 39527631 - PLoS Genet. 2024 Nov 11;20(11):e1011473. doi: 10.1371/journal.pgen.1011473
References_xml – reference: 39527631 - PLoS Genet. 2024 Nov 11;20(11):e1011473. doi: 10.1371/journal.pgen.1011473
SecondaryResourceType preprint
SourceID pubmedcentral
proquest
pubmed
SourceType Open Access Repository
Aggregation Database
Index Database
Title Prediction of causal genes at GWAS loci with pleiotropic gene regulatory effects using sets of correlated instrumental variables
URI https://www.ncbi.nlm.nih.gov/pubmed/38259344
https://www.proquest.com/docview/2917863630
https://pubmed.ncbi.nlm.nih.gov/PMC10802687