PerfAugur: Robust diagnostics for performance anomalies in cloud services

Cloud platforms involve multiple independently developed components, often executing on diverse hardware configurations and across multiple data centers. This complexity makes tracking various key performance indicators (KPIs) and manual diagnosing of anomalies in system behavior both difficult and...

Full description

Saved in:
Bibliographic Details
Published in2015 IEEE 31st International Conference on Data Engineering pp. 1167 - 1178
Main Authors Roy, Sudip, Konig, Arnd Christian, Dvorkin, Igor, Kumar, Manish
Format Conference Proceeding
LanguageEnglish
Published IEEE 01.04.2015
Subjects
Online AccessGet full text
ISSN1063-6382
DOI10.1109/ICDE.2015.7113365

Cover

More Information
Summary:Cloud platforms involve multiple independently developed components, often executing on diverse hardware configurations and across multiple data centers. This complexity makes tracking various key performance indicators (KPIs) and manual diagnosing of anomalies in system behavior both difficult and expensive. In this paper, we describe PerfAugur, an automated system for mining service logs to identify anomalies and help formulate data-driven hypotheses. PerfAugur includes a suite of efficient mining algorithms for detecting significant anomalies in system behavior, along with potential explanations for such anomalies, without the need for an explicit supervision signal. We perform extensive experimental evaluation using both synthetic and real-life data sets, and present detailed case studies showing the impact of this technology on operations of the Windows Azure Service.
ISSN:1063-6382
DOI:10.1109/ICDE.2015.7113365