An Empirical Study of the Bug Link Rate
Published in | IEEE International Conference on Software Quality, Reliability and Security (Online) pp. 177 - 188 |
---|---|
Main Authors | , , |
Format | Conference Proceeding |
Language | English |
Published | IEEE, 01.12.2022 |
ISSN | 2693-9177 |
DOI | 10.1109/QRS57517.2022.00028 |
Summary: | Defect data is critical for software defect prediction. To collect defect data, it is essential to establish links between bugs and their fixes. Missing links (i.e., a low link rate) can cause false negatives in the defect dataset and bias experimental results. Despite the importance of bug links, little prior work has used the bug link rate as a criterion for selecting subjects, and there is no empirical evidence on whether simpler alternative criteria exist for evaluating a project's link rate to aid selection. To this end, we conduct a comprehensive study of the bug link rate. Based on 34 open-source projects, we perform a detailed statistical analysis of the projects' actual link rates and examine the factors affecting link rates from both quantitative and qualitative perspectives. The findings could improve the understanding of bug link rates and guide the selection of better subjects for defect prediction. |