Emulating computer models with high-dimensional count output
Published in | Philosophical transactions of the Royal Society of London. Series A: Mathematical, physical, and engineering sciences Vol. 383; no. 2292; p. 20240216 |
---|---|
Main Authors | Salter, James M., McKinley, Trevelyan J., Xiong, Xiaoyu, Williamson, Daniel B. |
Format | Journal Article |
Language | English |
Published | England, The Royal Society, 13.03.2025 |
ISSN | 1364-503X 1471-2962 |
DOI | 10.1098/rsta.2024.0216 |
Abstract | Computer models are used to study the real world, and often contain a large number of uncertain input parameters, produce a large number of outputs, may be expensive to run and need calibrating to real-world observations to be useful for decision-making. Emulators are often used as cheap surrogates for the expensive simulator, trained on a small number of simulations to provide predictions with uncertainty at unseen inputs. In epidemiological applications, for example compartmental or agent-based models for modelling the spread of infectious diseases, the output is usually spatially and temporally indexed, stochastic and consists of counts rather than continuous variables. Here, we consider emulating high-dimensional count output from a complex computer model using a Poisson lognormal PCA (PLNPCA) emulator. We apply the PLNPCA emulator to output fields from a COVID-19 model for England and Wales and compare this to fitting emulators to aggregations of the full output. We show that performance is generally comparable, while the PLNPCA emulator inherits desirable properties, including allowing the full output to be predicted while capturing correlations between outputs, providing high-dimensional samples of counts that are representative of the true model output.
This article is part of the theme issue ‘Uncertainty quantification for healthcare and biological systems (Part 1)’. |
---|---|
Author | Xiong, Xiaoyu Salter, James M. McKinley, Trevelyan J. Williamson, Daniel B. |
Author_xml | – sequence: 1 givenname: James M. orcidid: 0000-0002-7428-6465 surname: Salter fullname: Salter, James M. – sequence: 2 givenname: Trevelyan J. surname: McKinley fullname: McKinley, Trevelyan J. – sequence: 3 givenname: Xiaoyu surname: Xiong fullname: Xiong, Xiaoyu – sequence: 4 givenname: Daniel B. surname: Williamson fullname: Williamson, Daniel B. |
BackLink | https://www.ncbi.nlm.nih.gov/pubmed/40078142$$D View this record in MEDLINE/PubMed |
CitedBy_id | crossref_primary_10_1098_rsta_2024_0232 |
ContentType | Journal Article |
Copyright | 2025 The Author(s). 2025 |
Copyright_xml | – notice: 2025 The Author(s). 2025 |
Discipline | Engineering Mathematics Sciences (General) Physics |
Genre | Journal Article |
GrantInformation_xml | – fundername: Engineering and Physical Sciences Research Council |
IsDoiOpenAccess | true |
IsOpenAccess | true |
IsPeerReviewed | true |
IsScholarly | true |
Issue | 2292 |
Keywords | Poisson lognormal; basis emulation; uncertainty quantification; Gaussian processes |
Language | English |
License | Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited. |
Notes | One contribution of 13 to a theme issue ‘Uncertainty quantification for healthcare and biological systems (Part 1)’. Electronic supplementary material is available online at https://doi.org/10.6084/m9.figshare.c.7611332. |
ORCID | 0000-0002-7428-6465 |
OpenAccessLink | https://pubmed.ncbi.nlm.nih.gov/PMC11904617 |
PMID | 40078142 |
PublicationDate | 2025-03-13 |
PublicationPlace | England |
PublicationPlace_xml | – name: England |
PublicationTitle | Philosophical transactions of the Royal Society of London. Series A: Mathematical, physical, and engineering sciences |
PublicationTitleAlternate | Philos Trans A Math Phys Eng Sci |
PublicationYear | 2025 |
Publisher | The Royal Society |
Publisher_xml | – name: The Royal Society |
StartPage | 20240216 |
URI | https://www.ncbi.nlm.nih.gov/pubmed/40078142 https://www.proquest.com/docview/3176688176 https://pubmed.ncbi.nlm.nih.gov/PMC11904617 |
Volume | 383 |
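
As a rough illustration of the Poisson lognormal PCA (PLNPCA) model named in the abstract, the following Python sketch samples high-dimensional counts from the model's standard generative form: a low-rank Gaussian latent layer on the log scale, with conditionally independent Poisson counts on top. This is not the authors' emulator. The function name `sample_pln_pca`, the dimensions and the parameter values are hypothetical and chosen only for illustration, and the step that would link the latent scores to simulator inputs is omitted.

```python
import numpy as np


def sample_pln_pca(mu, C, n_samples, rng):
    """Draw count vectors from a Poisson lognormal PCA model (illustrative sketch).

    mu : (p,) mean of the latent log-intensity (hypothetical values)
    C  : (p, q) loading matrix with q much smaller than p
    """
    p, q = C.shape
    # Latent scores: one q-dimensional standard normal vector per sample.
    W = rng.standard_normal((n_samples, q))
    # Latent Gaussian layer on the log scale; its covariance C @ C.T has rank q.
    Z = mu + W @ C.T
    # Counts are conditionally independent Poisson draws given the latent layer.
    Y = rng.poisson(np.exp(Z))
    return Y, Z


# Hypothetical sizes: 50 outputs driven by 3 latent components.
rng = np.random.default_rng(0)
p, q = 50, 3
mu = np.full(p, 2.0)
C = 0.3 * rng.standard_normal((p, q))
Y, Z = sample_pln_pca(mu, C, n_samples=200, rng=rng)
```

Because every output shares the same few latent scores, the sampled counts are correlated across outputs, which is the property the abstract highlights for the full-field predictions.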