Scalable Streaming-Array of Simple Soft-Processors for Stencil Computations with Constant Memory-Bandwidth
Stencil computation is one of the important kernels in scientific computations, however, the sustained performance is limited by memory bandwidth especially on multi-core microprocessors and GPGPUs due to its small operationalintensity. In this paper, we propose a scalable streaming-array (SSA) of s...
Saved in:
| Published in | 2011 IEEE 19th Annual International Symposium on Field-Programmable Custom Computing Machines pp. 234 - 241 |
|---|---|
| Main Authors | , , |
| Format | Conference Proceeding |
| Language | English |
| Published |
IEEE
01.05.2011
|
| Subjects | |
| Online Access | Get full text |
| ISBN | 9781612842776 1612842771 |
| DOI | 10.1109/FCCM.2011.12 |
Cover
| Summary: | Stencil computation is one of the important kernels in scientific computations, however, the sustained performance is limited by memory bandwidth especially on multi-core microprocessors and GPGPUs due to its small operationalintensity. In this paper, we propose a scalable streaming-array (SSA) of simple soft-processors for high-performance stencil computation on multiple FPGAs. The SSA architecture allows a multi-device system to have linear scalability of computing performance by deeply pipelining with a constant bandwidth of an external-memory. We present an array-structure of programmable cores optimized for stencil computations and formulate a performance model of pipelined execution on the array. For Jacobi computations, SSA implemented on nine Stratix III FPGAs with the memory bandwidth of only 2 GB/s achieves 260 GFlop/s, corresponding to 87.4 % of its peak performance, at 1.3 GFlop/sW. We demonstrate that SSA provides almost linear speedup for larger than medium-sized computation as expected by the performance model. These high utilization and scalability show a big potential of custom computing on reconfigurable devices as a power-efficient and high-performance computing platform. |
|---|---|
| ISBN: | 9781612842776 1612842771 |
| DOI: | 10.1109/FCCM.2011.12 |