Inferring software behavioral models with MapReduce

Bibliographic Details
Published in: Science of Computer Programming, Vol. 145, pp. 13-36
Main Authors: Luo, Chen; He, Fei; Ghezzi, Carlo
Format: Journal Article
Language: English
Published: Elsevier B.V., 01.10.2017
ISSN: 0167-6423, 1872-7964
DOI: 10.1016/j.scico.2017.04.004

Summary: In real-world practice, software systems are often built without developing any explicit upfront model. This can cause serious problems that may hinder the almost inevitable future evolution, since at best the only documentation about the software is in the form of source code comments. To address this problem, research has focused on the automatic inference of models by applying machine learning algorithms to execution logs. However, the logs generated by a real software system may be very large, and the inference algorithm can exceed the processing capacity of a single computer. This paper proposes a scalable, general approach to the inference of behavior models that can handle large execution logs via parallel and distributed algorithms implemented using the MapReduce programming model and executed on a cluster of interconnected execution nodes. The approach consists of two distributed phases that perform trace slicing and model synthesis. For each phase, a distributed algorithm using MapReduce is developed. With the parallel data processing capacity of MapReduce, the problem of inferring behavior models from large logs can be solved efficiently. The technique is implemented on top of Hadoop. Experiments on Amazon clusters show the efficiency and scalability of the approach.
• A distributed trace slicing algorithm using MapReduce.
• A distributed model synthesis algorithm using MapReduce.
• A novel approach for inferring software behavior models with MapReduce.
• Experimental results show promising performance of this approach.
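The two phases named in the summary (trace slicing, then model synthesis) can be illustrated with a small in-memory sketch. This is not the paper's algorithm and does not use Hadoop: the `map_reduce` helper, the `(timestamp, object_id, event)` log format, the `START`/`END` markers, and the choice of a transition-count model are all assumptions made for illustration only.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Hypothetical in-memory stand-in for one MapReduce job:
    map every record, group emitted values by key, then reduce each group."""
    shuffled = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            shuffled[key].append(value)
    output = []
    for key in sorted(shuffled):
        output.extend(reducer(key, shuffled[key]))
    return output


# Phase 1 (trace slicing): group log events by the object they refer to.
def slice_mapper(record):
    timestamp, object_id, event = record  # assumed log format
    yield object_id, (timestamp, event)


def slice_reducer(object_id, values):
    # Restore chronological order inside each slice.
    yield object_id, [event for _, event in sorted(values)]


# Phase 2 (model synthesis): count event-to-event transitions over all slices.
def synth_mapper(slice_record):
    _, events = slice_record
    path = ["START"] + events + ["END"]
    for src, dst in zip(path, path[1:]):
        yield (src, dst), 1


def synth_reducer(transition, counts):
    yield transition, sum(counts)


def infer_model(log):
    """Chain the two jobs: slices feed the synthesis step."""
    slices = map_reduce(log, slice_mapper, slice_reducer)
    return dict(map_reduce(slices, synth_mapper, synth_reducer))


# Toy log with interleaved events from two objects (made-up data).
log = [
    (1, "f1", "open"), (2, "f2", "open"), (3, "f1", "read"),
    (4, "f1", "close"), (5, "f2", "close"),
]
model = infer_model(log)
```

Here `model` maps each observed transition to how many slices exhibit it, which can be read as the edges of a simple behavioral model. On a real cluster each job would be split across nodes by key; the sketch only mirrors that data flow on one machine.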