Prediction of causal genes at GWAS loci with pleiotropic gene regulatory effects using sets of correlated instrumental variables
Multivariate Mendelian randomization (MVMR) is a statistical technique that uses sets of genetic instruments to estimate the direct causal effects of multiple exposures on an outcome of interest. At genomic loci with pleiotropic gene regulatory effects, that is, loci where the same genetic variants...
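As a rough illustration of the idea summarized above, the sketch below estimates direct causal effects from summary statistics when the instruments are correlated. It assumes the generalized least-squares form commonly used for summary-statistic MR with variants in linkage disequilibrium; this record does not state which estimator the authors used, so it should not be read as their implementation. The function name `mvmr_correlated`, its argument names, and the simulated inputs are hypothetical.

```python
import numpy as np


def mvmr_correlated(beta_x, beta_y, se_y, ld):
    """GLS estimate of direct causal effects from summary statistics.

    beta_x : (n_variants, n_exposures) variant-exposure associations
    beta_y : (n_variants,) variant-outcome associations
    se_y   : (n_variants,) standard errors of beta_y
    ld     : (n_variants, n_variants) LD correlation matrix of the variants
    (All names are illustrative assumptions, not taken from the paper.)
    """
    # Covariance of the outcome associations implied by LD between variants.
    omega = np.outer(se_y, se_y) * ld
    # Solve linear systems rather than forming omega^{-1} explicitly.
    omega_inv_bx = np.linalg.solve(omega, beta_x)
    omega_inv_by = np.linalg.solve(omega, beta_y)
    xtwx = beta_x.T @ omega_inv_bx
    estimate = np.linalg.solve(xtwx, beta_x.T @ omega_inv_by)
    std_err = np.sqrt(np.diag(np.linalg.inv(xtwx)))
    return estimate, std_err


# Toy, noise-free example: six variants in strong LD, two exposures (genes).
rng = np.random.default_rng(0)
n_variants = 6
idx = np.arange(n_variants)
ld = 0.9 ** np.abs(np.subtract.outer(idx, idx))   # AR(1)-style LD matrix
beta_x = rng.normal(size=(n_variants, 2))         # simulated eQTL effects
theta_true = np.array([0.5, 0.0])                 # only the first gene is causal
beta_y = beta_x @ theta_true                      # outcome effects without noise
se_y = np.full(n_variants, 0.05)

theta_hat, theta_se = mvmr_correlated(beta_x, beta_y, se_y, ld)
print(theta_hat)
```

Because the toy outcome associations are generated without noise, the estimate matches `theta_true` up to floating-point error even though neighbouring variants have correlation 0.9. With real data the LD matrix is itself estimated, and the abstract notes that its sampling error can introduce bias.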
Published in | ArXiv.org |
---|---|
Main Authors | Khan, Mariyam; Ludl, Adriaan-Alexander; Bankier, Sean; Björkegren, Johan LM; Michoel, Tom |
Format | Journal Article |
Language | English |
Published | United States, Cornell University, 20.09.2024 |
ISSN | 2331-8422 |
Abstract | Multivariate Mendelian randomization (MVMR) is a statistical technique that uses sets of genetic instruments to estimate the direct causal effects of multiple exposures on an outcome of interest. At genomic loci with pleiotropic gene regulatory effects, that is, loci where the same genetic variants are associated with multiple nearby genes, MVMR can potentially be used to predict candidate causal genes. However, consensus in the field dictates that the genetic instruments in MVMR must be independent (not in linkage disequilibrium), which is usually not possible when considering a group of candidate genes from the same locus. Here we used causal inference theory to show that MVMR with correlated instruments satisfies the instrumental set condition. This is a classical result by Brito and Pearl (2002) for structural equation models that guarantees the identifiability of individual causal effects in situations where multiple exposures collectively, but not individually, separate a set of instrumental variables from an outcome variable. Extensive simulations confirmed the validity and usefulness of these theoretical results. Importantly, the causal effect estimates remained unbiased and their variance small even when instruments were highly correlated, while bias introduced by horizontal pleiotropy or LD matrix sampling error was comparable to standard MR. We applied MVMR with correlated instrumental variable sets at genome-wide significant loci for coronary artery disease (CAD) risk using expression quantitative trait loci (eQTL) data from seven vascular and metabolic tissues in the STARNET study. Our method predicts causal genes at twelve loci, each associated with multiple colocated genes in multiple tissues. We confirm causal roles for PHACTR1 and ADAMTS7 in arterial tissues, among others. However, the extensive degree of regulatory pleiotropy across tissues and the limited number of causal variants in each locus still require that MVMR be run on a tissue-by-tissue basis, and testing all gene-tissue pairs with cis-eQTL associations at a given locus in a single model to predict causal gene-tissue combinations remains infeasible. Our results show that within tissues, MVMR with dependent, as opposed to independent, sets of instrumental variables significantly expands the scope for predicting causal genes in disease risk loci with pleiotropic regulatory effects. However, considering risk loci with regulatory pleiotropy that also spans across tissues remains an unsolved problem. |
---|---|
AbstractList | Multivariate Mendelian randomization (MVMR) is a statistical technique that uses sets of genetic instruments to estimate the direct causal effects of multiple exposures on an outcome of interest. At genomic loci with pleiotropic gene regulatory effects, that is, loci where the same genetic variants are associated with multiple nearby genes, MVMR can potentially be used to predict candidate causal genes. However, consensus in the field dictates that the genetic instruments in MVMR must be independent, which is usually not possible when considering a group of candidate genes from the same locus. We used causal inference theory to show that MVMR with correlated instruments satisfies the instrumental set condition. This is a classical result by Brito and Pearl (2002) for structural equation models that guarantees the identifiability of causal effects in situations where multiple exposures collectively, but not individually, separate a set of instrumental variables from an outcome variable. Extensive simulations confirmed the validity and usefulness of these theoretical results even at modest sample sizes. Importantly, the causal effect estimates remain unbiased and their variance small when instruments are highly correlated. We applied MVMR with correlated instrumental variable sets at risk loci from genome-wide association studies (GWAS) for coronary artery disease using eQTL data from the STARNET study. Our method predicts causal genes at twelve loci, each associated with multiple colocated genes in multiple tissues. However, the extensive degree of regulatory pleiotropy across tissues and the limited number of causal variants in each locus still require that MVMR be run on a tissue-by-tissue basis, and testing all gene-tissue pairs at a given locus in a single model to predict causal gene-tissue combinations remains infeasible. |
Author | Bankier, Sean; Ludl, Adriaan-Alexander; Björkegren, Johan LM; Khan, Mariyam; Michoel, Tom |
Author_xml | 1. Khan, Mariyam (Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway); 2. Ludl, Adriaan-Alexander (Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway); 3. Bankier, Sean (Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway); 4. Björkegren, Johan LM (Department of Genetics & Genomic Sciences/Institute of Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, NY, USA); 5. Michoel, Tom (Computational Biology Unit, Department of Informatics, University of Bergen, Bergen, Norway) |
BackLink | https://www.ncbi.nlm.nih.gov/pubmed/38259344$$D View this record in MEDLINE/PubMed |
ContentType | Journal Article |
DatabaseName | PubMed MEDLINE - Academic PubMed Central (Full Participant titles) |
Discipline | Physics |
EISSN | 2331-8422 |
ExternalDocumentID | PMC10802687 38259344 |
Genre | Journal Article Preprint |
GrantInformation_xml | NHLBI NIH HHS: R01 HL168174; R01 HL164577; R01 HL148167; R01 HL166428; R01 HL148239 |
IsOpenAccess | true |
IsPeerReviewed | false |
IsScholarly | false |
Language | English |
License | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which allows reusers to copy and distribute the material in any medium or format in unadapted form only, for noncommercial purposes only, and only so long as attribution is given to the creator. |
OpenAccessLink | https://pubmed.ncbi.nlm.nih.gov/PMC10802687 |
PMID | 38259344 |
PublicationDate | 2024-Sep-20 |
PublicationTitle | ArXiv.org |
PublicationTitleAlternate | ArXiv |
References | 39527631 - PLoS Genet. 2024 Nov 11;20(11):e1011473. doi: 10.1371/journal.pgen.1011473 |
URI | https://www.ncbi.nlm.nih.gov/pubmed/38259344 https://www.proquest.com/docview/2917863630 https://pubmed.ncbi.nlm.nih.gov/PMC10802687 |