An Empirical Study of the Bug Link Rate

Bibliographic Details
Published in: IEEE International Conference on Software Quality, Reliability and Security (Online), pp. 177-188
Main Authors: Li, Chenglin; Zhao, Yangyang; Yang, Yibiao
Format: Conference Proceeding
Language: English
Published: IEEE, 01.12.2022
ISSN: 2693-9177
DOI: 10.1109/QRS57517.2022.00028

Summary: Defect data is critical for software defect prediction. To collect defect data, it is essential to establish links between bugs and their fixes. Missing links (i.e., a low link rate) can introduce false negatives into the defect dataset and bias experimental results. Despite the importance of bug links, little prior work has used the bug link rate as a criterion for selecting subject projects, and there is no empirical evidence on whether simpler alternative criteria can be used to assess a project's link rate and aid selection. To this end, we conduct a comprehensive study of the bug link rate. Based on 34 open-source projects, we perform a detailed statistical analysis of the projects' actual link rates and examine the factors affecting link rates from both quantitative and qualitative perspectives. The findings could improve the understanding of bug link rates and guide the selection of better subjects for defect prediction.
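
To make the notion of a link rate in the summary concrete, here is a minimal sketch. It assumes the link rate is the share of fixed bugs whose ID appears in at least one commit message. This record does not give the paper's exact definition or linking method, so the `link_rate` function, the `BUG_ID` pattern, and the sample data are all hypothetical.

```python
import re

# Hypothetical ID format (e.g. "PROJ-123"); real issue trackers differ.
BUG_ID = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")


def link_rate(fixed_bug_ids, commit_messages):
    """Return the share of fixed bugs mentioned in at least one commit message.

    Assumed definition for illustration only, not the paper's method.
    """
    fixed = set(fixed_bug_ids)
    if not fixed:
        return 0.0
    mentioned = set()
    for message in commit_messages:
        # Collect every bug ID that the commit message refers to.
        mentioned.update(BUG_ID.findall(message))
    # Only IDs that belong to known fixed bugs count as links.
    return len(fixed & mentioned) / len(fixed)


# Made-up sample data: four fixed bugs, three commits.
bugs = ["PROJ-1", "PROJ-2", "PROJ-3", "PROJ-4"]
commits = ["PROJ-1: fix null check", "Refactor parser", "Fix PROJ-3 and PROJ-9"]
rate = link_rate(bugs, commits)
```

In this made-up sample, two of the four fixed bugs are mentioned in a commit message. The other two would be missing from a defect dataset built from these links, which is the false-negative risk the summary describes.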