On-Line Fault Monitoring
| Published in | Journal of electronic testing Vol. 12; no. 1-2; pp. 21 - 27 |
|---|---|
| Main Author | |
| Format | Journal Article |
| Language | English |
| Published | Boston: Springer Nature B.V, 01.02.1998 |
| Subjects | |
| ISSN | 0923-8174 1573-0727 |
| DOI | 10.1023/A:1008201032535 |
| Summary: | Sequoia's fault-tolerant computers were designed subject to some rather rigid constraints: no single hardware malfunction can generate an undetected error; an integrated circuit is a "black box" that can fail in arbitrary ways, affecting an arbitrary subset of input and output signals; faults can be transient or intermittent with arbitrary durations and repetition intervals. Moreover, the incremental hardware used to achieve these goals was to be kept to a minimum. The resulting computers do, to a very large extent, satisfy these constraints. To achieve this, a combination of fault-monitoring techniques was used, including bit and nibble error-correcting and error-detecting codes; byte parity codes with orthogonal partitioning; cyclic-residue codes on I/O data transfers; codes designed to protect against address counter overruns on I/O transfers; and lossless control-signal compactors. The nature of and rationale for these fault monitors are described, along with the analytical and testing techniques used to estimate the resulting coverage. |
|---|---|