DAPPER: data-aware approximate NoC for GPGPU architectures
| Published in | Proceedings of the Twelfth IEEE/ACM International Symposium on Networks-on-Chip, pp. 1-8 |
|---|---|
| Main Authors | Raparti, Venkata Yaswanth; Pasricha, Sudeep |
| Format | Conference Proceeding | 
| Language | English | 
| Published | Piscataway, NJ, USA: IEEE Press, 04.10.2018 |
| Series | ACM Conferences | 
| Subjects | approximate computing; network-on-chip; GPGPU |
| ISBN | 1538648938 9781538648933  | 
| DOI | 10.5555/3306619.3306626 | 
| Abstract | High interconnect bandwidth is crucial to achieving better performance in many-core GPGPU architectures that execute highly data-parallel applications. The parallel warps of threads running on shader cores generate a high volume of read requests to main memory because data cache space at the shader cores is limited. This leads to scenarios in which reply data arrives rapidly from the DRAM, creating a bottleneck at the memory controllers (MCs) that send reply packets back to the requesting cores over the NoC. Coping with such high volumes of data requires NoC architectures that incur high power overhead. To achieve high-bandwidth, low-energy communication in GPGPUs, we propose Dapper, a data-aware approximate NoC architecture that increases the utilization of the available bandwidth by using low-power, single-cycle overlay circuits for the reply traffic between MCs and shader cores. Dapper also incorporates a novel MC architecture that leverages the inherent approximability of the data values of certain applications and reduces the number of reply packets injected into the NoC by the MCs. Experimental results show that Dapper reduces the energy consumed in the GPGPU by up to 50%, with up to 99% application output accuracy and minimal performance overhead, compared to state-of-the-art approximate NoC architectures. |
    
| Author | Raparti, Venkata Yaswanth; Pasricha, Sudeep |
    
| Author_xml | 1. Raparti, Venkata Yaswanth (yaswanth@rams.colostate.edu), Colorado State University; 2. Pasricha, Sudeep (sudeep@colostate.edu), Colorado State University |
    
| ContentType | Conference Proceeding | 
    
| EndPage | 8 | 
    
| IsPeerReviewed | false | 
    
| IsScholarly | false | 
    
| Keywords | approximate computing; network-on-chip; GPGPU |
    
| MeetingName | NOCS '18: International Symposium on Networks-on-Chip | 
    
| PageCount | 8 | 
    
| PublicationCentury | 2000 | 
    
| PublicationDate | 20181004 | 
    
| PublicationDateYYYYMMDD | 2018-10-04 | 
    
| PublicationDecade | 2010 | 
    
| PublicationPlace | Piscataway, NJ, USA | 
    
| PublicationSeriesTitle | ACM Conferences | 
    
| PublicationTitle | Proceedings of the Twelfth IEEE/ACM International Symposium on Networks-on-Chip | 
    
| PublicationYear | 2018 | 
    
| Publisher | IEEE Press | 
    
| StartPage | 1 | 
    
| Subtitle | data aware approximate NoC for GPGPU architectures | 
    
| Title | DAPPER | 
    