Perturbation-based Fault Screening
| Published in | 2007 IEEE 13th International Symposium on High Performance Computer Architecture, pp. 169-180 |
|---|---|
| Main Authors | , , , | 
| Format | Conference Proceeding | 
| Language | English | 
| Published | IEEE, 01.02.2007 |
| ISBN | 9781424408047 1424408040  | 
| ISSN | 1530-0897 | 
| DOI | 10.1109/HPCA.2007.346195 | 
| Summary: | Fault screeners are a new breed of fault identification technique that can probabilistically detect if a transient fault has affected the state of a processor. We demonstrate that fault screeners function because of two key characteristics. First, we show that much of the intermediate data generated by a program inherently falls within certain consistent bounds. Second, we observe that these bounds are often violated by the introduction of a fault. Thus, fault screeners can identify faults by directly watching for any data inconsistencies arising in an application's behavior. We present an idealized algorithm capable of identifying over 85% of injected faults on the SpecInt suite and over 75% overall. Further, in a realistic implementation on a simulated Pentium-III-like processor, about half of the errors due to injected faults are identified while still in speculative state. Errors detected this early can be eliminated by a pipeline flush. In this paper, we present several hardware-based versions of this screening algorithm and show that flushing the pipeline every time the hardware screener triggers reduces overall performance by less than 1%. | 
|---|---|
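
The summary's central idea, that intermediate values tend to stay within consistent bounds and that faults often break those bounds, can be illustrated with a small software sketch. This is not the paper's algorithm or any of its hardware designs, which this record does not describe. The `RangeScreener` class, its `slack` parameter, the `flip_bit` helper, and the instruction address `0x400` are all hypothetical names chosen for illustration.

```python
class RangeScreener:
    """Hypothetical screener: remembers the value range seen at each instruction."""

    def __init__(self, slack=0):
        self.slack = slack   # assumed tolerance added to each bound
        self.bounds = {}     # pc -> (lowest, highest) value accepted so far

    def observe(self, pc, value):
        """Return True if value falls outside the learned range for this pc."""
        if pc not in self.bounds:
            self.bounds[pc] = (value, value)
            return False     # nothing to compare against yet
        low, high = self.bounds[pc]
        suspicious = value < low - self.slack or value > high + self.slack
        if not suspicious:
            # Only values that pass the check are allowed to widen the range.
            self.bounds[pc] = (min(low, value), max(high, value))
        return suspicious


def flip_bit(value, bit):
    """Model a transient fault as a single flipped bit (an assumption)."""
    return value ^ (1 << bit)


screener = RangeScreener(slack=8)
for i in range(100):         # fault-free loop counter at a made-up address
    screener.observe(0x400, i)

clean = screener.observe(0x400, 100)                 # next fault-free value
faulty = screener.observe(0x400, flip_bit(101, 20))  # high-order bit flipped
missed = screener.observe(0x400, flip_bit(101, 1))   # low-order bit flipped
```

Under these assumptions the high-order flip is flagged while the low-order flip stays inside the learned range and is missed, which is one way to see why such detection is probabilistic rather than exact.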