Perturbation-based Fault Screening

Bibliographic Details
Published in	2007 IEEE 13th International Symposium on High Performance Computer Architecture, pp. 169-180
Main Authors	Racunas, P., Constantinides, K., Manne, S., Mukherjee, S.S.
Format	Conference Proceeding
Language	English
Published	IEEE 01.02.2007
Subjects	Circuit faults; CMOS technology; Costs; Error analysis; Fault diagnosis; Fault tolerance; Hardware; Microprocessors; Pipelines; Protection
ISBN	9781424408047, 1424408040
ISSN	1530-0897
DOI	10.1109/HPCA.2007.346195

Summary:	Fault screeners are a new breed of fault identification technique that can probabilistically detect if a transient fault has affected the state of a processor. We demonstrate that fault screeners function because of two key characteristics. First, we show that much of the intermediate data generated by a program inherently falls within certain consistent bounds. Second, we observe that these bounds are often violated by the introduction of a fault. Thus, fault screeners can identify faults by directly watching for any data inconsistencies arising in an application's behavior. We present an idealized algorithm capable of identifying over 85% of injected faults on the SpecInt suite and over 75% overall. Further, in a realistic implementation on a simulated Pentium-III-like processor, about half of the errors due to injected faults are identified while still in speculative state. Errors detected this early can be eliminated by a pipeline flush. In this paper, we present several hardware-based versions of this screening algorithm and show that flushing the pipeline every time the hardware screener triggers reduces overall performance by less than 1%.
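
The two characteristics named in the summary (values that stay within consistent bounds, and faults that tend to break those bounds) can be illustrated with a small software model. The summary does not specify the screening algorithm, so what follows is only a minimal sketch under stated assumptions, not the authors' method. It assumes a screener that tracks the lowest and highest result seen per static instruction and flags any value outside that range. The names RangeScreener, slack, observe, and flip_bit are hypothetical, as is modeling a transient fault as a single bit flip.

```python
from dataclasses import dataclass, field


@dataclass
class RangeScreener:
    """Hypothetical screener: learns per-instruction value bounds, flags outliers."""
    slack: int = 0  # assumed tolerance added around the learned bounds
    bounds: dict = field(default_factory=dict)  # pc -> (lowest, highest) seen so far

    def observe(self, pc: int, value: int) -> bool:
        """Return True if `value` violates the bounds learned for instruction `pc`."""
        if pc not in self.bounds:
            # First value from this instruction: nothing to compare against yet.
            self.bounds[pc] = (value, value)
            return False
        low, high = self.bounds[pc]
        violated = value < low - self.slack or value > high + self.slack
        # Widen the bounds so a legitimate new extreme triggers only once.
        self.bounds[pc] = (min(low, value), max(high, value))
        return violated


def flip_bit(value: int, bit: int) -> int:
    """Model a transient fault as a single bit flip in a result value (assumption)."""
    return value ^ (1 << bit)


screener = RangeScreener(slack=16)
for v in [100, 104, 108, 112]:  # fault-free results of one instruction
    screener.observe(pc=0x400, value=v)

# A flip in a low-order bit stays inside the learned bounds and goes unnoticed.
low_bit_fault = screener.observe(0x400, flip_bit(116, 0))
# A flip in a high-order bit lands far outside the bounds and is flagged.
high_bit_fault = screener.observe(0x400, flip_bit(116, 20))
```

In this sketch the high-order flip is flagged while the low-order flip slips through, which is one way to read the "probabilistic" detection the summary describes. A legitimate new extreme would also trigger the screener once; per the summary, the response to a trigger is a pipeline flush, so such a false positive would cost performance rather than correctness.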