A Metrics-Based Data Mining Approach for Software Clone Detection

Bibliographic Details
Published in: 2012 IEEE 36th Annual Computer Software and Applications Conference, pp. 35-41
Main Author: Abd-El-Hafiz, S. K.
Format: Conference Proceeding
Language: English
Published: IEEE, 01.07.2012
ISBN: 9781467319904, 1467319902
ISSN: 0730-3157
DOI: 10.1109/COMPSAC.2012.14

Summary: The detection of function clones in software systems is valuable for maintenance activities such as code adaptation and error checking. This paper presents an efficient metrics-based data mining clone detection approach. First, metrics are collected for all functions in the software system. A data mining algorithm, fractal clustering, is then utilized to partition the software system into a relatively small number of clusters. Each of the resulting clusters encapsulates functions that are within a specific proximity of each other in the metrics space. Finally, clone classes, rather than pairs, are easily extracted from the resulting clusters. For large software systems, the approach is very space efficient and linear in the size of the data set. Evaluation is performed using medium and large open-source software systems. In this evaluation, the effect of the chosen metrics on the detection precision is investigated.
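
The pipeline outlined in the summary (collect per-function metrics, group functions that lie close together in the metrics space, read off clone classes) can be illustrated with a minimal Python sketch. It is not the paper's method: fractal clustering is not implemented here, and a simple grid-cell bucketing stands in for it. The four metrics, the helper names `function_metrics` and `clone_classes`, the `cell` parameter, and the `SAMPLE` input are all assumptions chosen for illustration.

```python
import ast
from collections import defaultdict


def function_metrics(source):
    """Return {function_name: metric_vector} for each function in `source`.

    The metric set is an illustrative assumption, not the paper's:
    (parameter count, statement count, branch/loop count, call count).
    """
    metrics = {}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.FunctionDef):
            inner = list(ast.walk(node))
            metrics[node.name] = (
                len(node.args.args),
                # Subtract 1 so the def statement itself is not counted.
                sum(isinstance(n, ast.stmt) for n in inner) - 1,
                sum(isinstance(n, (ast.If, ast.For, ast.While)) for n in inner),
                sum(isinstance(n, ast.Call) for n in inner),
            )
    return metrics


def clone_classes(metrics, cell=1):
    """Group functions whose metric vectors fall into the same grid cell.

    Stand-in for the clustering step: one pass over the data, and each
    bucket with more than one member is reported as a clone class.
    """
    buckets = defaultdict(list)
    for name, vec in metrics.items():
        buckets[tuple(v // cell for v in vec)].append(name)
    return [sorted(names) for names in buckets.values() if len(names) > 1]


# Hypothetical input: two structurally identical functions and one unrelated one.
SAMPLE = '''
def total(xs):
    s = 0
    for x in xs:
        s += x
    return s

def product(ys):
    p = 1
    for y in ys:
        p *= y
    return p

def greet(name):
    print("hello", name)
'''

classes = clone_classes(function_metrics(SAMPLE))
print(classes)
```

Because each bucket holds every function that shares a cell, the output is a set of clone classes rather than a list of pairs, which mirrors the distinction the summary draws. A larger `cell` value loosens the proximity threshold, at the likely cost of precision.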