Scalable Streaming-Array of Simple Soft-Processors for Stencil Computations with Constant Memory-Bandwidth
Stencil computation is one of the important kernels in scientific computations; however, the sustained performance is limited by memory bandwidth, especially on multi-core microprocessors and GPGPUs, due to its small operational intensity. In this paper, we propose a scalable streaming-array (SSA) of s...
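The bandwidth limit described above can be illustrated with a little arithmetic on the figures quoted in this record's abstract (260 GFlop/s sustained, 87.4% of peak, 2 GB/s external memory bandwidth, 1.3 GFlop/s per watt). The sketch below is not the paper's performance model; it assumes a generic roofline-style bound, the helper name `attainable_gflops` is hypothetical, and the 0.5 flop/byte intensity is an illustrative value rather than a number from the paper.

```python
def attainable_gflops(peak_gflops, bandwidth_gb_s, flops_per_byte):
    """Roofline-style bound (an assumption, not the paper's model):
    performance is capped by the compute peak or by bandwidth * intensity."""
    return min(peak_gflops, bandwidth_gb_s * flops_per_byte)

# Figures quoted in the abstract.
sustained_gflops = 260.0   # GFlop/s on nine FPGAs
fraction_of_peak = 0.874   # 87.4% of peak
bandwidth_gb_s = 2.0       # external memory bandwidth in GB/s
gflops_per_watt = 1.3      # power efficiency

# Quantities implied by those figures.
implied_peak_gflops = sustained_gflops / fraction_of_peak  # roughly 297.5 GFlop/s
implied_power_w = sustained_gflops / gflops_per_watt       # roughly 200 W
required_intensity = sustained_gflops / bandwidth_gb_s     # 130 flop per external byte

# Hypothetical low-intensity stencil (0.5 flop/byte) fed directly from memory:
# the bound falls to bandwidth * intensity, far below the compute peak.
memory_bound_gflops = attainable_gflops(implied_peak_gflops, bandwidth_gb_s, 0.5)
```

Under these assumptions, sustaining 260 GFlop/s from a 2 GB/s memory stream requires about 130 floating-point operations per byte of external traffic, which is consistent with the abstract's emphasis on deep pipelining across devices while external bandwidth stays constant.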
| Published in | 2011 IEEE 19th Annual International Symposium on Field-Programmable Custom Computing Machines, pp. 234-241 |
|---|---|
| Main Authors | Sano, K; Hatsuda, Y; Yamamoto, S |
| Format | Conference Proceeding | 
| Language | English | 
| Published | IEEE, 01.05.2011 |
| Subjects | |
| ISBN | 9781612842776 1612842771  | 
| DOI | 10.1109/FCCM.2011.12 | 
| Abstract | Stencil computation is one of the important kernels in scientific computations; however, the sustained performance is limited by memory bandwidth, especially on multi-core microprocessors and GPGPUs, due to its small operational intensity. In this paper, we propose a scalable streaming-array (SSA) of simple soft-processors for high-performance stencil computation on multiple FPGAs. The SSA architecture allows a multi-device system to scale computing performance linearly through deep pipelining while the external-memory bandwidth stays constant. We present an array structure of programmable cores optimized for stencil computations and formulate a performance model of pipelined execution on the array. For Jacobi computations, SSA implemented on nine Stratix III FPGAs with a memory bandwidth of only 2 GB/s achieves 260 GFlop/s, corresponding to 87.4% of its peak performance, at 1.3 GFlop/s/W. We demonstrate that SSA provides almost linear speedup for medium-sized and larger computations, as expected from the performance model. This high utilization and scalability show the strong potential of custom computing on reconfigurable devices as a power-efficient, high-performance computing platform. |
    
| Author | Sano, K Hatsuda, Y Yamamoto, S  | 
    
| Author details | 1. Sano, K (kentah@caero.mech.tohoku.ac.jp), Grad. Sch. of Inf. Sci., Tohoku Univ., Sendai, Japan; 2. Hatsuda, Y (hatsuda@caero.mech.tohoku.ac.jp), Grad. Sch. of Inf. Sci., Tohoku Univ., Sendai, Japan; 3. Yamamoto, S (yamamoto@caero.mech.tohoku.ac.jp), Grad. Sch. of Inf. Sci., Tohoku Univ., Sendai, Japan |
    
| EISBN | 0769543014 9780769543017  | 
    
| Genre | orig-research | 
    
| IsPeerReviewed | false | 
    
| IsScholarly | false | 
    
| PageCount | 8 | 
    
| SubjectTerms | Arrays; Bandwidth; computation; Delay; Field programmable gate arrays; FPGA; High-performance stencil; Pipeline processing; Scalability; scalable streaming-array |
    
| URI | https://ieeexplore.ieee.org/document/5771279 | 
    
| thumbnail_s | http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=9781612842776/sc.gif&client=summon&freeimage=true |