Accelerating joint species distribution modelling with Hmsc-HPC by GPU porting

Bibliographic Details
Published in: PLoS computational biology Vol. 20; no. 9; p. e1011914
Main Authors: Rahman, Anis Ur; Tikhonov, Gleb; Oksanen, Jari; Rossi, Tuomas; Ovaskainen, Otso
Format: Journal Article
Language: English
Published: United States, Public Library of Science, 03.09.2024
ISSN: 1553-7358, 1553-734X
DOI: 10.1371/journal.pcbi.1011914

Summary: Joint species distribution modelling (JSDM) is a widely used statistical method that analyzes combined patterns of all species in a community, linking empirical data to ecological theory and enhancing community-wide prediction tasks. However, fitting JSDMs to large datasets is often computationally demanding and time-consuming. Recent studies have introduced new statistical and machine learning techniques to provide more scalable fitting algorithms, but extending these to complex JSDM structures that account for spatial dependencies or multi-level sampling designs remains challenging. In this study, we aim to enhance JSDM scalability by leveraging high-performance computing (HPC) resources for an existing fitting method. Our work focuses on the Hmsc R-package, a widely used JSDM framework that supports the integration of various dataset types into a single comprehensive model. We developed a GPU-compatible implementation of its model-fitting algorithm using Python and the TensorFlow library. Despite these changes, our enhanced framework retains the original user interface of the Hmsc R-package. We evaluated the performance of the proposed implementation across various model configurations and dataset sizes. Our results show a significant increase in model fitting speed for most models compared to the baseline Hmsc R-package. For the largest datasets, we achieved speed-ups of over 1000 times, demonstrating the substantial potential of GPU porting for previously CPU-bound JSDM software. This advancement opens promising opportunities for better utilizing the rapidly accumulating new biodiversity data resources for inference and prediction.