Accelerating joint species distribution modelling with Hmsc-HPC by GPU porting
| Published in | PLoS computational biology Vol. 20; no. 9; p. e1011914 |
|---|---|
| Main Authors | , , , , | 
| Format | Journal Article | 
| Language | English | 
| Published | United States: Public Library of Science, 03.09.2024 |
| ISSN | 1553-734X, 1553-7358 |
| DOI | 10.1371/journal.pcbi.1011914 | 
| Summary: | Joint species distribution modelling (JSDM) is a widely used statistical method that analyzes combined patterns of all species in a community, linking empirical data to ecological theory and enhancing community-wide prediction tasks. However, fitting JSDMs to large datasets is often computationally demanding and time-consuming. Recent studies have introduced new statistical and machine learning techniques to provide more scalable fitting algorithms, but extending these to complex JSDM structures that account for spatial dependencies or multi-level sampling designs remains challenging. In this study, we aim to enhance JSDM scalability by leveraging high-performance computing (HPC) resources for an existing fitting method. Our work focuses on the Hmsc R-package, a widely used JSDM framework that supports the integration of various dataset types into a single comprehensive model. We developed a GPU-compatible implementation of its model-fitting algorithm using Python and the TensorFlow library. Despite these changes, our enhanced framework retains the original user interface of the Hmsc R-package. We evaluated the performance of the proposed implementation across various model configurations and dataset sizes. Our results show a significant increase in model fitting speed for most models compared to the baseline Hmsc R-package. For the largest datasets, we achieved speed-ups of over 1000 times, demonstrating the substantial potential of GPU porting for previously CPU-bound JSDM software. This advancement opens promising opportunities for better utilizing the rapidly accumulating new biodiversity data resources for inference and prediction. |
|---|---|