Accelerating joint species distribution modelling with Hmsc-HPC by GPU porting
Joint species distribution modelling (JSDM) is a widely used statistical method that analyzes combined patterns of all species in a community, linking empirical data to ecological theory and enhancing community-wide prediction tasks. However, fitting JSDMs to large datasets is often computationally...
Saved in:
| Published in | PLoS computational biology Vol. 20; no. 9; p. e1011914 |
|---|---|
| Main Authors | , , , , |
| Format | Journal Article |
| Language | English |
| Published |
United States
Public Library of Science
03.09.2024
|
| Subjects | |
| Online Access | Get full text |
| ISSN | 1553-7358 1553-734X 1553-7358 |
| DOI | 10.1371/journal.pcbi.1011914 |
Cover
| Summary: | Joint species distribution modelling (JSDM) is a widely used statistical method that analyzes combined patterns of all species in a community, linking empirical data to ecological theory and enhancing community-wide prediction tasks. However, fitting JSDMs to large datasets is often computationally demanding and time-consuming. Recent studies have introduced new statistical and machine learning techniques to provide more scalable fitting algorithms, but extending these to complex JSDM structures that account for spatial dependencies or multi-level sampling designs remains challenging. In this study, we aim to enhance JSDM scalability by leveraging high-performance computing (HPC) resources for an existing fitting method. Our work focuses on the
Hmsc
R-package, a widely used JSDM framework that supports the integration of various dataset types into a single comprehensive model. We developed a GPU-compatible implementation of its model-fitting algorithm using Python and the
TensorFlow
library. Despite these changes, our enhanced framework retains the original user interface of the
Hmsc
R-package. We evaluated the performance of the proposed implementation across various model configurations and dataset sizes. Our results show a significant increase in model fitting speed for most models compared to the baseline
Hmsc
R-package. For the largest datasets, we achieved speed-ups of over 1000 times, demonstrating the substantial potential of GPU porting for previously CPU-bound JSDM software. This advancement opens promising opportunities for better utilizing the rapidly accumulating new biodiversity data resources for inference and prediction. |
|---|---|
| Bibliography: | new_version ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 23 |
| ISSN: | 1553-7358 1553-734X 1553-7358 |
| DOI: | 10.1371/journal.pcbi.1011914 |