Deep learning-powered visual inspection for metal surfaces – Impact of annotations on algorithms based on defect characteristics

Bibliographic Details
Published in: Advanced Engineering Informatics, Vol. 62, p. 102727
Main Authors: Dubey, Pallavi; Miller, Seth; Elçin Günay, Elif; Jackman, John; Kremer, Gül E.; Kremer, Paul A.
Format: Journal Article
Language: English
Published: Elsevier Ltd, 01.10.2024
ISSN: 1474-0346
DOI: 10.1016/j.aei.2024.102727

Summary: In general, the labeling process provides a set of annotations that are used for supervised learning. A major assumption of this process is that each annotation represents the ground truth about an observed phenomenon, which is defined by manually labeling it. While most extant Deep Learning (DL) research focuses on improving the accuracy and efficiency of training and inference algorithms, only limited attention has been paid to data validation. Potential inconsistencies in the labeling process for DL fall into this less-investigated category. This study assessed the performance of You Only Look Once version 5 small (YOLOv5s) using confidence intervals (CIs) for each defect type in a metal defect benchmark dataset, GC10-DET. The impact of standardizing the labeling process and the role of consistency in labeling were evaluated through an experimental study. The results showed that individually labeled small-size defects with precise bounding boxes perform better than defects labeled inconsistently as a group. Improved data validation through precise labeling increased average precision (AP) by 12.26–25.78% across defect categories. This overall result points to the need for further evaluation of an image dataset through data validation before comparing algorithms on a benchmark dataset, and for using bootstrap CIs when categories have limited data.
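The summary's recommendation to use bootstrap CIs when a defect category has limited data can be sketched as follows. This is a minimal percentile-bootstrap example with hypothetical per-run AP values, not the paper's actual data or code:

```python
import random

def bootstrap_ci(samples, stat=lambda xs: sum(xs) / len(xs),
                 n_boot=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for a statistic over a small sample.

    Resamples `samples` with replacement `n_boot` times, computes the
    statistic on each resample, and returns the (alpha/2, 1 - alpha/2)
    percentile interval of the bootstrap distribution.
    """
    rng = random.Random(seed)
    n = len(samples)
    boots = sorted(
        stat([samples[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    low = boots[int((alpha / 2) * n_boot)]
    high = boots[int((1 - alpha / 2) * n_boot) - 1]
    return low, high

# Hypothetical AP values for one defect category across repeated
# evaluation runs (illustrative numbers only).
ap_samples = [0.62, 0.58, 0.71, 0.65, 0.60, 0.68, 0.55, 0.63]
low, high = bootstrap_ci(ap_samples)
print(f"95% bootstrap CI for mean AP: [{low:.3f}, {high:.3f}]")
```

With only a handful of AP measurements per category, the percentile bootstrap avoids normality assumptions that a standard-error-based interval would require.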