Evaluating Performance of Task and Data Coarsening in Concurrent Collections


Bibliographic Details
Published in Languages and Compilers for Parallel Computing Vol. 10136; pp. 331-345
Main Authors Liu, Chenyang; Kulkarni, Milind
Format Book Chapter
Language English
Published Switzerland Springer International Publishing AG 2017
Springer International Publishing
Series Lecture Notes in Computer Science
ISBN 3319527088
9783319527086
ISSN 0302-9743
1611-3349
DOI 10.1007/978-3-319-52709-3_24

Summary: Programmers face many challenges in obtaining performance on machines with increasingly capable, yet increasingly complex, hardware. A trend towards task-parallel and asynchronous many-task programming models aims to alleviate the burden of parallel programming on a vast array of current and future platforms. One such model, Concurrent Collections (CnC), provides a programming paradigm that emphasizes the separation of concerns: domain experts concentrate on their algorithms and correctness, whereas performance experts handle mapping and tuning to a target platform. With the CnC model, a deep understanding of parallel constructs and behavior is not necessary to write parallel applications that run on various multi-threaded and multi-core platforms. However, performance can vary greatly depending on the granularity of tasks and data declared by the programmer. These program-specific decisions are not part of the CnC tuning capabilities and must be tuned in the program itself. We analyze the performance behavior that results from tuning various elements in each collection for the LULESH application using CnC. We demonstrate the effects of different techniques for modifying task and data granularity in CnC collections. Our fully tiled CnC implementation outperforms the OpenMP counterpart by 3× for 48 processors. Finally, we propose guidelines to emulate the techniques used to obtain high performance while improving programmability.