Data Science, Analytics and Collaboration for a Biosurveillance Ecosystem
Published in | Online journal of public health informatics Vol. 11; no. 1 |
---|---|
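Main Authors | Stark, Karen; Shah, Amol; Borgman, Jacob; Somborac, Miko; Carson, Jeremy; Hauser, Lauren; Kola, Krishna; Virkar, Hermant |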
Format | Journal Article |
Language | English |
Published | University of Illinois at Chicago Library, 30.05.2019 |
ISSN | 1947-2579 |
DOI | 10.5210/ojphi.v11i1.9702 |
Abstract | Objective: While there is a growing torrent of data that disease surveillance could leverage, few effective tools exist that help public health professionals make sense of these data or that provide secure work-sharing and communication. Meanwhile, our ever more connected world provides an increasingly receptive environment for diseases to emerge and spread rapidly, making early warning and collaborative decision-making essential to saving lives and reducing the impact of outbreaks. Digital Infuzion's previous work on the Defense Threat Reduction Agency (DTRA)'s Biosurveillance Ecosystem (BSVE) built a cloud-based platform to ingest big data with analytics to provide users a robust surveillance environment. We next enhanced the BSVE data sources and analytics to support an integrated One Health paradigm. The resulting BSVE and Digital Infuzion's HARBINGER platform include: 1) identification and ingestion of data sources that span global human, animal and crop health; 2) inclusion of non-health data such as travel, weather, and infrastructure; 3) the data science tools, analytics and visualizations to make these data useful; and 4) a fully featured Collaboration Center for secure work-sharing and communication across agencies. Introduction: After the 2009 H1N1 pandemic, the Assistant Secretary of Defense for Nuclear, Chemical and Biological Defense indicated “biodefense” would include emerging infectious disease. In response, DTRA launched an initiative for an innovative, rapidly emerging capability to enable real-time biosurveillance for early warning and course of action analysis. Through competitive prototyping, DTRA selected Digital Infuzion to develop the platform and next-generation analytics. This work was extended to enhance collaboration capabilities and to harness data science and advanced analytics for multi-disciplinary surveillance including climate, crop, and animal as well as human data. 
New analysis tools ensure the BSVE supports a One Health paradigm to best inform public health action. Digital Infuzion and DTRA first introduced the BSVE to the ISDS community at the 2013 annual conference SWAP Meet. Digital Infuzion is pleased to present the mature platform to this community again, as it is now a fully developed capability undergoing FedRAMP certification with the Department of Homeland Security’s National Biosurveillance Integration Center and is the basis for Digital Infuzion's HARBINGER ecosystem for biosurveillance. Methods: We integrated over 170 global One Health data sources using cloud-based automated data ingestion workflows that provide unified access with data provenance. We used modular automated workflows to implement data science including Natural Language Processing (NLP), machine learning, anomaly detection, and expert systems for extraction of concepts from unstructured text. A first-of-its-kind ontology for biosurveillance permits linking of data across sources. This ontology allows users to rapidly find all relevant data by looking at semantic relationships within and across data sets having varying quality, types, and usages to understand the best, most complete indicators of impending threats. We applied the following principles to the development of data science tools: 1) mathematics should be fully automated and operate 'under the hood' without need for user intervention; 2) 'At-a-Glance' visualizations should summarize information, draw attention to key aspects and permit drill-down into underlying data; 3) data science analytics and tools need to be validated with real-world data and by disease surveillance experts; and 4) secure collaboration capabilities are essential to biosurveillance activities. This was a highly complex effort. We worked closely with surveillance analysts from multiple agencies and organizations to continuously guide the development of capabilities. 
We drew upon subject matter expertise in public health, machine learning, social media, NLP, semantics, big data integration, computational science, and visualization. A high level of automation, security and immediacy of data was applied to support rapid identification and investigation of potential outbreaks. Results: The platform now provisions integrated One Health information. Data sources were harmonized and expanded, along with historical information, to better predict and understand biothreats. These include global social media, human, plant, animal, and weather data. An Analyst Workbench delivers logical, intuitive and interactive visualizations enabling disease surveillance professionals to identify critical, predictive information without extensive manual research. Over 700 approved users currently have access to the prototype. Biosurveillance activities can be performed collaboratively among governmental agencies, public health officials, and the general public using the Collaboration Center and its sharing and messaging systems. Data sharing is HIPAA compliant and distinguishes public from private data using carefully controlled and approved role- and attribute-based access for security. To speed disease surveillance workflows, the workbench generates suggestions to the user on their current work. Anomaly detection, which alerts users to potentially developing disease events, employs fully automated analytics to conduct over 43 million calculations daily for more than 500 diseases in over 170 data sources, distilling these into a table that ranks the most significant anomalous increases that may indicate an outbreak and warrant investigation. A predictive disease modeling tool based on current and historical data uses fuzzy logic to identify the likeliest outcome, even early in an outbreak when there is much uncertainty about the disease and its characteristics. 
A complex automated workflow identifies health-related topics that are trending on Twitter and evaluates their severity using novel lexicons and new reactive sentiment analysis. Searches use the ontology to gather all relevant information and are supported by the most advanced NLP with custom surveillance rules to provide succinctly extracted information. This alleviates the need for extensive reading by identifying exactly which data are needed and extracting key concepts from them. Intuitive methods of visual representation, interactive displays, and drill-down capabilities were leveraged in all analytics for rapid understanding of results. Finally, we added a software development kit to enable third-party developers to continuously enhance the platform capabilities by adding new data sources and new analytic apps. This allows the platform to be adapted for specific needs and to keep pace with new scientific and technical discoveries, and has resulted in over 50 analytic apps. Conclusions: The addition of One Health data and analytics, and the integration of health data with unconventional data sources and modern approaches to data science and complex workflows, resulted in enhanced situational awareness and decision-making capabilities for users. The expanded Collaboration Center within the workbench enables users to partner and collaborate with other agencies and biosurveillance professionals both nationally and internationally to maximize the rapidity of responses to serious disease outbreaks. |
---|---|
Author | Virkar, Hermant; Borgman, Jacob; Carson, Jeremy; Shah, Amol; Kola, Krishna; Stark, Karen; Somborac, Miko; Hauser, Lauren |
ContentType | Journal Article |
Copyright | ISDS Annual Conference Proceedings 2019. 2019 the author(s) |
Discipline | Medicine |
EISSN | 1947-2579 |
Issue | 1 |
Language | English |
License | This is an open-access article distributed under the terms of the Creative Commons Attribution Non-Commercial (CC BY-NC) 4.0 License. |
URI | https://pubmed.ncbi.nlm.nih.gov/PMC6606281 |
Volume | 11 |
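
The Results section of the abstract describes anomaly detection that distills many per-disease, per-source calculations into a table ranking the most significant anomalous increases. The abstract does not say which statistic the platform uses, so the following is only a minimal sketch under stated assumptions: it scores the latest daily count of each series with a simple z-score against that series' own history, keeps increases above a threshold, and sorts them. The function name, parameters, threshold and sample data are all hypothetical and are not taken from the BSVE or HARBINGER.

```python
from statistics import mean, stdev


def rank_anomalous_increases(series_by_key, threshold=3.0, min_history=7):
    """Rank series whose latest count is unusually high versus its history.

    series_by_key maps a (disease, source) tuple to a list of daily counts,
    oldest first. The z-score test is an illustrative assumption, not the
    method used by the platform described in the abstract.
    """
    ranked = []
    for key, counts in series_by_key.items():
        history, latest = counts[:-1], counts[-1]
        if len(history) < min_history:
            continue  # too little baseline to judge this series
        baseline = mean(history)
        spread = stdev(history)
        if spread == 0:
            continue  # flat baseline: z-score undefined, skipped in this sketch
        z = (latest - baseline) / spread
        if z >= threshold:  # only increases are of interest
            ranked.append((key, z))
    # Largest exceedance first, mirroring a ranked alert table.
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked


# Hypothetical sample data: three (disease, source) series of daily counts.
sample = {
    ("influenza", "source_a"): [10, 12, 11, 9, 10, 12, 11, 30],
    ("measles", "source_b"): [2, 3, 2, 3, 2, 3, 2, 3],
    ("dengue", "source_c"): [5, 6, 5, 6, 5, 6, 5, 12],
}

for (disease, source), score in rank_anomalous_increases(sample):
    print(f"{disease:10s} {source:10s} z={score:.2f}")
```

With the sample data, the influenza and dengue series exceed the threshold and the measles series does not. A production system would likely need a baseline that accounts for seasonality and reporting delays, which this sketch ignores.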