Efficient Parallel Implementation of the BH Algorithm on Heterogeneous Platforms
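This work studies the architectural characteristics of heterogeneous platforms that pair multi-core CPUs with many-core accelerators or coprocessors. It parallelizes the Barnes-Hut (BH) algorithm for the N-body problem using a hybrid MPI and OpenMP programming model, uses orthogonal recursive bisection (ORB) to balance load across processes, and then applies parallel optimization and MIC acceleration. After optimization and acceleration, the program reaches more than 3.4 times the performance of the original version; MIC acceleration by itself raises performance to 1.7 times the pre-acceleration level. The program scales well: with particle counts on the order of 100 million, it scales to 32 nodes with 4,480 cores in total (640 CPU cores and 3,840 MIC cores).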
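The abstract names ORB as the load-balancing method but does not describe the authors' implementation. As a rough, serial illustration of the general idea only, the sketch below recursively cuts a particle set along its longest axis so that each side receives a share of the total work proportional to the number of partitions assigned to it. The function name `orb_partition`, the per-particle `weights`, and the longest-axis cut rule are all assumptions made for this example, not details taken from the paper.

```python
def orb_partition(points, weights, n_parts):
    """Split particles into n_parts groups of roughly equal total weight.

    points  : list of (x, y, z) tuples
    weights : hypothetical per-particle work estimates (e.g. interaction counts)
    n_parts : number of partitions (e.g. one per MPI process), >= 1
    Returns a list of n_parts index lists.
    """
    def extent(idx, axis):
        coords = [points[i][axis] for i in idx]
        return max(coords) - min(coords)

    def recurse(idx, parts):
        # Base cases: nothing left to split, or only one partition requested.
        if parts == 1 or len(idx) <= 1:
            return [idx] + [[] for _ in range(parts - 1)]

        # Assumption: cut orthogonally to the axis with the largest extent.
        axis = max(range(3), key=lambda a: extent(idx, a))
        idx = sorted(idx, key=lambda i: points[i][axis])

        # The left side gets left_parts/parts of the total weight,
        # which also handles partition counts that are not powers of two.
        left_parts = parts // 2
        target = sum(weights[i] for i in idx) * left_parts / parts

        acc, cut = 0.0, 0
        while cut < len(idx) - 1 and acc + weights[idx[cut]] <= target:
            acc += weights[idx[cut]]
            cut += 1
        cut = max(cut, 1)  # keep both sides non-empty

        return recurse(idx[:cut], left_parts) + recurse(idx[cut:], parts - left_parts)

    return recurse(list(range(len(points))), n_parts)
```

In a distributed code, each resulting group would be owned by one process, and the weights would typically be refreshed from the previous time step's measured work; both of those points are general practice rather than claims about this paper.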
| Published in | 计算机应用研究 (Application Research of Computers) Vol. 33; no. 8; pp. 2255-2259 |
|---|---|
| Main Author | Li Chanyi |
| Format | Journal Article |
| Language | Chinese |
| Published | University of Chinese Academy of Sciences, Beijing 100049; Supercomputing Center, Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, 2016 |
| ISSN | 1001-3695 |
| DOI | 10.3969/j.issn.1001-3695.2016.08.003 |
| Bibliography: | 51-1196/TP. This paper studies the architectural characteristics of heterogeneous platforms that combine multi-core CPUs with accelerators or coprocessors, and presents a parallel implementation of the N-body BH algorithm using a hybrid MPI and OpenMP programming model. It uses orthogonal recursive bisection (ORB) to balance load between processes, then optimizes the code on multi-core CPUs and accelerates it on MIC. Test results show that, after optimization and acceleration, the code runs more than 3.4x as fast as the original version and 1.7x as fast as when running only on multi-core CPUs. The code also scales well, running 100 million particles on a 32-node cluster with 4,480 cores (640 CPU cores and 3,840 MIC cores). Li Chanyi, Wang Wu, Feng Yangde, Xie Li (1. Supercomputing Center, Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China; 2. University of Chinese Academy of Sciences, Beijing 100049, China). N-body problem; BH algorithm; heteroge |