Predictors of Software Metric Correlation: A Non-parametric Analysis

Bibliographic Details
Published in	IEEE International Conference on Software Quality, Reliability and Security (Online), pp. 524-533
Main Authors	Afriyie, Daniel; Labiche, Yvan
Format	Conference Proceeding
Language	English
Published	IEEE 01.12.2021
Subjects	Code Complexity; Codes; Correlation; Cyclomatic Complexity; Industries; Lines of Code; Measurement; Production; Production Code; Software metrics; Software quality; Spearman Correlation; Test Code
ISSN	2693-9177
DOI	10.1109/QRS54544.2021.00063

Summary:	A number of authors hypothesize and experimentally confirm that Cyclomatic Complexity (CC) has a very strong correlation with Lines of Code (LOC), justifying the use of LOC in place of CC. Others report a moderate correlation and advocate for the use of both metrics. These studies have, for the most part, studied production code, and we suspect different results may be observed for test code. With 40 different, large open-source subjects and five subjects from industry partners, we collected metric values for LOC, CC and Halstead Effort (HE) and measured their correlation. In test code, contrary to production code, there exists a very weak (or almost no) correlation between (a) LOC and CC, and a weak (nearly moderate) correlation between (b) HE and CC and (c) LOC and HE. We therefore argue that the level of correlation depends on at least three factors, namely the kind of code (i.e., production code vs test code), the kind of software (open-source vs industry) and the kind of metric (LOC, CC, HE). Given the weak monotonicity between CC, LOC and HE that we observe, we aim to challenge the viewpoint that CC and Halstead metrics are redundant with LOC, as some studies suggest, at least on test code. We therefore advocate for using CC over LOC (or both, or cyclomatic density) when studying test code, as CC is perceived to better reflect cognitive complexity, numerical complexity, interdependency and code refactoring that cannot be accounted for simply by LOC.
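
The summary reports correlations between LOC, CC and HE, and the subject terms list Spearman correlation. As a rough illustration of what such a rank-based (non-parametric) measurement involves, the following is a minimal sketch in plain Python. It is not the authors' tooling: the function names are hypothetical, the sample values are made up and do not come from the paper, and cyclomatic density is assumed here to mean CC divided by LOC.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman's rho, computed as the Pearson correlation of the rank vectors."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    rx, ry = average_ranks(xs), average_ranks(ys)
    mean_x, mean_y = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one sample is constant")
    return cov / sqrt(var_x * var_y)


def cyclomatic_density(cc, loc):
    """Cyclomatic density, assumed here to be CC / LOC (not defined in the record)."""
    return cc / loc


# Made-up per-method metric values, for illustration only (not data from the paper).
loc = [12, 30, 8, 45, 20, 15]
cc = [1, 1, 2, 3, 1, 4]
he = [310.0, 900.0, 150.0, 2100.0, 480.0, 700.0]

print(f"rho(LOC, CC) = {spearman_rho(loc, cc):.3f}")
print(f"rho(HE, CC)  = {spearman_rho(he, cc):.3f}")
print(f"rho(LOC, HE) = {spearman_rho(loc, he):.3f}")
print("densities:", [round(cyclomatic_density(c, l), 3) for c, l in zip(cc, loc)])
```

Because rho depends only on ranks, it captures monotonic rather than strictly linear association, which is why a low value is read in the summary as weak monotonicity between the metrics.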