Hierarchical redesign of classic MPI reduction algorithms
Optimization of MPI collective communication operations has been an active research topic since the advent of MPI in 1990s. Many general and architecture-specific collective algorithms have been proposed and implemented in the state-of-the-art MPI implementations. Hierarchical topology-oblivious tra...
Saved in:
| Published in | The Journal of supercomputing Vol. 73; no. 2; pp. 713 - 725 |
|---|---|
| Main Authors | , |
| Format | Journal Article |
| Language | English |
| Published |
New York
Springer US
01.02.2017
Springer Nature B.V |
| Subjects | |
| Online Access | Get full text |
| ISSN | 0920-8542 1573-0484 |
| DOI | 10.1007/s11227-016-1779-7 |
Cover
| Summary: | Optimization of MPI collective communication operations has been an active research topic since the advent of MPI in 1990s. Many general and architecture-specific collective algorithms have been proposed and implemented in the state-of-the-art MPI implementations. Hierarchical topology-oblivious transformation of existing communication algorithms has been recently proposed as a new promising approach to optimization of MPI collective communication algorithms and MPI-based applications. This approach has been successfully applied to the most popular parallel matrix multiplication algorithm, SUMMA, and the state-of-the-art MPI broadcast algorithms, demonstrating significant multifold performance gains, especially for large-scale HPC systems. In this paper, we apply this approach to optimization of the MPI Reduce and Allreduce operations. Theoretical analysis and experimental results on a cluster of Grid’5000 platform are presented. |
|---|---|
| Bibliography: | ObjectType-Article-1 SourceType-Scholarly Journals-1 ObjectType-Feature-2 content type line 14 |
| ISSN: | 0920-8542 1573-0484 |
| DOI: | 10.1007/s11227-016-1779-7 |