Fault Monitoring with Sequential Matrix Factorization

Bibliographic Details
Published in ACM Transactions on Autonomous and Adaptive Systems, Vol. 10, no. 3, pp. 1-25
Main Authors Feng, Dawei; Germain, Cecile
Format Journal Article
Language English
Published Association for Computing Machinery (ACM) 01.10.2015
ISSN 1556-4665, 1556-4703
DOI 10.1145/2797141

Summary: For real-world distributed systems, the knowledge component at the core of the MAPE-K loop has to be inferred, as it cannot be realistically assumed to be defined a priori. Accordingly, this paper considers fault monitoring as a latent factors discovery problem. In the context of end-to-end probing, the goal is to devise an efficient sampling policy that makes the best use of a constrained sampling budget. Previous work addresses fault monitoring in a collaborative prediction framework, where the information is a snapshot of the probe outcomes. Here, we take into account the fact that the system dynamically evolves at various time scales. We propose and evaluate Sequential Matrix Factorization (SMF), which exploits both the recent advances in matrix factorization for the instantaneous information and a new sampling heuristic based on historical information. The effectiveness of the SMF approach is exemplified on datasets of increasing difficulty and compared with state-of-the-art history-based or snapshot-based methods. In all cases, strong adaptivity under the specific flavor of active learning is required to unleash the full potential of coupling the most confident and the most uncertain sampling heuristics, which is the cornerstone of SMF.
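
The summary describes spending a limited probing budget on a mix of the most uncertain and the most confident predictions. The sketch below only illustrates that general idea and is not the SMF algorithm from the paper. It assumes a matrix of predicted fault probabilities is already available (for example from some matrix factorization model), and it assumes that confidence can be measured as the distance of a prediction from 0.5. The function name `select_probes` and the parameter `uncertain_share` are hypothetical.

```python
import numpy as np


def select_probes(scores, observed, budget, uncertain_share=0.5):
    """Choose which unobserved (source, target) pairs to probe next.

    scores   : 2-D array of predicted fault probabilities in [0, 1] (assumed given).
    observed : boolean array of the same shape, True where a probe was already run.
    budget   : maximum number of new probes.
    uncertain_share : fraction of the budget spent on the most uncertain entries
                      (hypothetical knob; the rest goes to the most confident ones).
    """
    candidates = np.flatnonzero(~observed.ravel())
    budget = min(budget, candidates.size)

    # Assumption: distance from 0.5 measures confidence (small = uncertain).
    margin = np.abs(scores.ravel()[candidates] - 0.5)
    order = np.argsort(margin, kind="stable")

    n_uncertain = int(round(budget * uncertain_share))
    n_confident = budget - n_uncertain

    uncertain = candidates[order[:n_uncertain]]
    confident = candidates[order[order.size - n_confident:]]
    picked = np.concatenate([uncertain, confident])

    # Convert flat indices back to (row, column) pairs of plain ints.
    return [tuple(int(v) for v in np.unravel_index(i, scores.shape)) for i in picked]


# Toy data: rows are probe sources, columns are probe targets.
scores = np.array([[0.05, 0.50],
                   [0.45, 0.90],
                   [0.70, 0.98]])
observed = np.zeros(scores.shape, dtype=bool)
observed[0, 0] = True  # this pair was already probed

picks = select_probes(scores, observed, budget=2)
```

With a budget of two, this toy policy spends one probe on the prediction closest to 0.5 and one on the prediction farthest from it. A history-aware policy such as the one the paper proposes would additionally use past probe outcomes, which this sketch does not model.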