Vicis: a reliable network for unreliable silicon
| Published in | 2009 46th ACM/IEEE Design Automation Conference, pp. 812-817 |
|---|---|
| Main Authors | Fick, David; DeOrio, Andrew; Hu, Jin; Bertacco, Valeria; Blaauw, David; Sylvester, Dennis |
| Format | Conference Proceeding |
| Language | English |
| Published | New York, NY, USA: ACM; IEEE, 26 July 2009 |
| Series | ACM Conferences |
| ISBN | 9781605584973, 1605584975 |
| ISSN | 0738-100X |
| DOI | 10.1145/1629911.1630119 |
| Abstract | Process scaling has given designers billions of transistors to work with. As feature sizes near the atomic scale, extensive variation and wearout inevitably make margining uneconomical or impossible. The ElastIC project seeks to address this by creating a large-scale chip-multiprocessor that can self-diagnose, adapt, and heal. Creating large, flexible designs in this environment naturally lends itself to the repetitive nature of network-on-chip (NoC), but the loss of a single link or router will result in complete network failure. In this work we present Vicis, an ElastIC-style NoC that can tolerate the loss of many network components due to wearout-induced hard faults. Vicis uses the inherent redundancy in the network and its routers in order to maintain correct operation while incurring a much lower area overhead than previously proposed N-modular redundancy (NMR) based solutions. Each router has a built-in self-test (BIST) that diagnoses the locations of hard faults and runs a number of algorithms to best use ECC, port swapping, and a crossbar bypass bus to mitigate them. The routers work together to run distributed algorithms to solve network-wide problems as well, protecting the network against critical failures in individual routers. In this work we show that with stuck-at fault rates as high as 1 in 2000 gates, Vicis will continue to operate with approximately half of its routers still functional and communicating. | 
    
|---|---|
| Author | DeOrio, Andrew; Bertacco, Valeria; Sylvester, Dennis; Fick, David; Blaauw, David; Hu, Jin | 
    
| Author_xml | 1. Fick, David (University of Michigan, Ann Arbor, MI); 2. DeOrio, Andrew (University of Michigan, Ann Arbor, MI); 3. Hu, Jin (University of Michigan, Ann Arbor, MI); 4. Bertacco, Valeria (University of Michigan, Ann Arbor, MI); 5. Blaauw, David (University of Michigan, Ann Arbor, MI); 6. Sylvester, Dennis (University of Michigan, Ann Arbor, MI) | 
    
| ContentType | Conference Proceeding | 
    
| Copyright | 2009 ACM | 
    
| Copyright_xml | – notice: 2009 ACM | 
    
| DatabaseName | IEEE Electronic Library (IEL) Conference Proceedings IEEE Proceedings Order Plan (POP) 1998-present by volume IEEE Xplore All Conference Proceedings IEEE Xplore IEEE Proceedings Order Plans (POP) 1998-present  | 
    
| Discipline | Applied Sciences; Engineering | 
    
| EndPage | 817 | 
    
| ExternalDocumentID | 5227053 | 
    
| Genre | orig-research | 
    
| IsPeerReviewed | false | 
    
| IsScholarly | true | 
    
| Keywords | fault tolerance; torus; Network-on-Chip; hard faults; N-modular redundancy; reconfiguration; built-in-self-test | 
    
| LCCN | 85644924 | 
    
| Language | English | 
    
| License | Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Permissions@acm.org | 
    
| MeetingName | DAC '09: The 46th Annual Design Automation Conference 2009 | 
    
| PageCount | 6 | 
    
| PublicationCentury | 2000 | 
    
| PublicationDate | 20090726 2009-July  | 
    
| PublicationDateYYYYMMDD | 2009-07-26 2009-07-01  | 
    
| PublicationDate_xml | – month: 07 year: 2009 text: 20090726 day: 26  | 
    
| PublicationDecade | 2000 | 
    
| PublicationPlace | New York, NY, USA | 
    
| PublicationPlace_xml | – name: New York, NY, USA | 
    
| PublicationSeriesTitle | ACM Conferences | 
    
| PublicationTitle | 2009 46th ACM/IEEE Design Automation Conference | 
    
| PublicationTitleAbbrev | DAC | 
    
| PublicationYear | 2009 | 
    
| Publisher | ACM; IEEE | 
    
| Publisher_xml | – name: ACM – name: IEEE  | 
    
| SourceType | Publisher | 
    
| StartPage | 812 | 
    
| SubjectTerms | Built-in-Self-Test; Electric breakdown; Fault diagnosis; Fault tolerance; Hard Faults; N-Modular Redundancy; Network-on-a-chip; Network-on-Chip; Power system management; Reconfiguration; Redundancy; Silicon; System recovery; Telecommunication traffic; Testing; Torus | 
    
| Subtitle | a reliable network for unreliable silicon | 
    
| Title | Vicis | 
    
| URI | https://ieeexplore.ieee.org/document/5227053 | 
    