Dynamic task scheduling for linear algebra algorithms on distributed-memory multicore systems

Bibliographic Details
Published in	Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis, pp. 1-11
Main Authors	Song, Fengguang; YarKhan, Asim; Dongarra, Jack
Format	Conference Proceeding
Language	English
Published	New York, NY, USA: ACM, 14 November 2009
Series	ACM Conferences
Subjects	Algorithms; Computing methodologies > Symbolic and algebraic manipulation > Symbolic and algebraic algorithms > Linear algebra algorithms; Dynamic scheduling; General and reference > Cross-computing tools and techniques > Performance; Heuristic algorithms; Instruction sets; Libraries; Linear algebra; Mathematics of computing > Mathematical analysis > Numerical analysis > Computations on matrices; Message systems; Multicore processing; Runtime; Scalability; Software and its engineering > Software creation and management > Software verification and validation > Operational analysis; Software and its engineering > Software organization and properties > Contextual software domains > Operating systems > Process management > Scheduling; Theory of computation > Design and analysis of algorithms > Approximation algorithms analysis > Scheduling algorithms; Theory of computation > Design and analysis of algorithms > Online algorithms > Online learning algorithms > Scheduling algorithms; Theory of computation > Theory and algorithms for application domains > Machine learning theory > Reinforcement learning > Sequential decision making
ISBN	1605587443 9781605587448
ISSN	2167-4329
DOI	10.1145/1654059.1654079

More Information
Summary:	This paper presents a dynamic task scheduling approach to executing dense linear algebra algorithms on multicore systems (either shared-memory or distributed-memory). We use a task-based library to replace existing linear algebra subroutines such as PBLAS, transparently providing the same interface and computational function as the ScaLAPACK library. Linear algebra programs are written with the task-based library and executed by a dynamic runtime system. The design of the runtime system focuses mainly on performance scalability. We propose a distributed algorithm that resolves data dependences without process cooperation. We have implemented the runtime system and applied it to three linear algebra algorithms: Cholesky, LU, and QR factorizations. Our experiments on both shared-memory machines (16 and 32 cores) and distributed-memory machines (1024 cores) demonstrate that the runtime system achieves good scalability. Furthermore, we provide an analysis that shows why the tiled algorithms are scalable and what their expected execution time is.