A multiobjective evolutionary programming framework for graph-based data mining


Bibliographic Details
Published in: Information Sciences, Vol. 237, pp. 118-136
Main Authors: Shelokar, Prakash; Quirin, Arnaud; Cordón, Óscar
Format: Journal Article
Language: English
Published: Elsevier Inc, 10.07.2013
ISSN: 0020-0255, 1872-6291
DOI: 10.1016/j.ins.2013.02.014

Summary: Subgraph mining is the process of identifying concepts describing interesting and repetitive subgraphs within graph-based data. The exponential number of possible subgraphs makes the problem very challenging. Existing methods apply a single-objective subgraph search with the view that interesting subgraphs are those capable of not merely compressing the data, but also enhancing the interpretation of the data considerably. Usually, these methods operate by posing simple constraints (or user-defined thresholds), such as returning all subgraphs whose frequency is above a specified threshold. Such a search approach may often return either a large number of solutions in the case of a weakly defined objective or very few in the case of a very strictly defined objective. In this paper, we propose a framework based on multiobjective evolutionary programming to mine subgraphs by jointly maximizing two objectives: the support and the size of the extracted subgraphs. The proposed methodology is able to discover a nondominated set of interesting subgraphs subject to the tradeoff between the two objectives, which otherwise would not be achieved by the single-objective search. In addition, it can use different specific multiobjective evolutionary programming methods. Experimental results obtained by three of the latter methods on synthetically generated as well as real-life graph-based datasets validate the utility of the proposed methodology when benchmarked against classical single-objective methods and their previous, nonevolutionary multiobjective extensions.
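
The notion of a nondominated set over the two objectives named in the summary (support and size) can be illustrated with a minimal sketch. This is not the authors' evolutionary algorithm; it only shows a naive Pareto filter over already-scored candidates. The `Candidate` type, the subgraph names, the objective values, and the threshold of 30 are all hypothetical, chosen purely for illustration.

```python
from typing import List, NamedTuple


class Candidate(NamedTuple):
    """A mined subgraph summarised by its two objective values (hypothetical type)."""
    name: str
    support: int  # how often the subgraph occurs in the data
    size: int     # how large the subgraph is


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    return (a.support >= b.support and a.size >= b.size
            and (a.support > b.support or a.size > b.size))


def nondominated(cands: List[Candidate]) -> List[Candidate]:
    """Keep the candidates that no other candidate dominates (naive O(n^2) filter)."""
    return [c for c in cands if not any(dominates(o, c) for o in cands)]


# Made-up objective values, both to be maximized.
candidates = [
    Candidate("g1", support=40, size=2),
    Candidate("g2", support=25, size=5),
    Candidate("g3", support=20, size=4),  # dominated by g2
    Candidate("g4", support=8, size=9),
    Candidate("g5", support=8, size=7),   # dominated by g4
]

# Tradeoff set: small frequent subgraphs through large rare ones.
front = nondominated(candidates)
print([c.name for c in front])

# For contrast, a single support threshold (assumed value) keeps only g1.
frequent = [c.name for c in candidates if c.support >= 30]
print(frequent)
```

In this toy data the filter keeps one candidate per tradeoff level, whereas the single threshold discards the larger but rarer subgraphs entirely, which is the limitation of threshold-based search that the summary describes.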