Scalable NIC-based Reduction on Large-scale Clusters

Bibliographic Details
Published in: Proceedings of the 2003 ACM/IEEE conference on Supercomputing, p. 59
Main Authors: Moody, Adam; Fernandez, Juan; Petrini, Fabrizio; Panda, Dhabaleswar K.
Format: Conference Proceeding
Language: English
Published: New York, NY, USA: ACM, 15.11.2003; IEEE
Series: ACM Conferences
ISBN: 9781581136951, 1581136951
DOI: 10.1145/1048935.1050209

More Information
Summary: Many parallel algorithms require efficient reduction collectives. In response, researchers have designed algorithms considering a range of parameters including data size, system size, and communication characteristics. Throughout this past work, however, processing was limited to the host CPU. Today, modern Network Interface Cards (NICs) sport programmable processors with substantial memory, and thus introduce a fresh variable into the equation. In this paper, we investigate this new option in the context of large-scale clusters. Through experiments on the 960-node, 1920-processor ASCI Linux Cluster (ALC) at Lawrence Livermore National Laboratory, we show that NIC-based reductions outperform host-based algorithms in terms of reduced latency and increased consistency. In particular, in the largest configuration tested (1812 processors), our NIC-based algorithm summed single-element vectors of 32-bit integers and 64-bit floating-point numbers in 73 µs and 118 µs, respectively. These results represent respective improvements of 121% and 39% over the production-level MPI library.