Adaptive Task Aggregation for High-Performance Sparse Solvers on GPUs

Bibliographic Details
Published in	Proceedings / International Conference on Parallel Architectures and Compilation Techniques, pp. 324-336
Main Authors	Helal, Ahmed E.; Aji, Ashwin M.; Chu, Michael L.; Beckmann, Bradford M.; Feng, Wu-chun
Format	Conference Proceeding
Language	English
Published	IEEE 01.09.2019
Subjects	Computational modeling; Computer architecture; data dependency; fine-grained parallelism; GPUs; Graphics processing units; Kernel; Processor scheduling; runtime adaptation; scheduling; sparse linear algebra; Task analysis; task parallel execution
ISSN	2641-7936
DOI	10.1109/PACT.2019.00033

Summary:	Sparse solvers are heavily used in computational fluid dynamics (CFD), computer-aided design (CAD), and other important application domains. These solvers remain challenging to execute on massively parallel architectures, due to the sequential dependencies between the fine-grained application tasks. In particular, parallel sparse solvers typically suffer from substantial scheduling and dependency-management overheads relative to the compute operations. We propose adaptive task aggregation (ATA) to efficiently execute such irregular computations on GPU architectures via hierarchical dependency management and low-latency task scheduling. On a gamut of representative problems with different data-dependency structures, ATA significantly outperforms existing GPU task-execution approaches, achieving a geometric mean speedup of 2.2X to 3.7X across different sparse kernels (with speedups of up to two orders of magnitude).