Active defect discovery: A human-in-the-loop learning method


Bibliographic Details
Published in IIE transactions Vol. ahead-of-print; no. ahead-of-print; pp. 1-14
Main Authors Shen, Bo; Kong, Zhenyu (James)
Format Journal Article
Language English
Published Abingdon Taylor & Francis 02.06.2024
Taylor & Francis Ltd
ISSN 2472-5854
2472-5862
DOI 10.1080/24725854.2023.2224854

Summary: Unsupervised defect detection methods are applied to an unlabeled dataset and produce a list of instances ranked by defect score. Unfortunately, many of the instances that unsupervised algorithms rank highest are not defects, which leads to high false-positive rates. Active Defect Discovery (ADD) has been proposed to overcome this deficiency: it sequentially selects instances and asks for their labels (defect or not). However, labeling is often costly, so balancing detection accuracy against labeling cost is essential. This article proposes a novel ADD method to achieve that balance. The approach uses a state-of-the-art unsupervised defect detection method, Isolation Forest, as the baseline defect detector to extract features. The sparsity of the extracted features is then exploited to adjust the defect detector so that it focuses on the features most important for defect detection. To enforce this sparsity and thereby improve detection accuracy, a new algorithm based on online gradient descent, Sparse Approximated Linear Defect Discovery (SALDD), is proposed together with a theoretical regret analysis. Extensive experiments are conducted on real-world datasets from domains including healthcare, manufacturing, and security. The results demonstrate that the proposed algorithm significantly outperforms state-of-the-art algorithms for defect detection.
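
The feedback loop described in the summary can be illustrated with a minimal sketch. This is not the authors' SALDD algorithm, whose details are not given in this record. It only assumes that each instance already has a feature vector (for example, one derived from an Isolation Forest), that a linear scorer ranks the instances, and that each label from the analyst triggers one online gradient step followed by soft-thresholding to keep the weights sparse. The names `active_discovery`, `soft_threshold`, `oracle`, `lr`, and `l1`, the linear loss, and the toy data are all hypothetical choices made for illustration.

```python
def soft_threshold(w, tau):
    """Shrink every weight toward zero by tau (proximal step for an L1 penalty)."""
    out = []
    for x in w:
        mag = max(abs(x) - tau, 0.0)
        out.append(mag if x > 0 else -mag)
    return out


def dot(w, x):
    return sum(a * b for a, b in zip(w, x))


def active_discovery(features, oracle, budget, lr=0.5, l1=0.01):
    """Query `budget` instances, always the highest-scored unlabeled one.

    features: list of equal-length feature vectors (assumed precomputed).
    oracle:   callable returning True if instance i is a defect (the human).
    Returns the indices confirmed as defects and the final weight vector.
    """
    dim = len(features[0])
    w = [1.0 / dim] * dim              # uniform start (an assumption)
    unlabeled = set(range(len(features)))
    found = []
    for _ in range(budget):
        # Present the instance the current scorer considers most suspicious.
        i = max(unlabeled, key=lambda j: dot(w, features[j]))
        unlabeled.remove(i)
        y = 1 if oracle(i) else -1
        if y == 1:
            found.append(i)
        # Gradient step on the linear loss -y * <w, x>, whose gradient is -y * x.
        w = [wk + lr * y * xk for wk, xk in zip(w, features[i])]
        # Sparsity step: small weights are pushed toward exactly zero.
        w = soft_threshold(w, lr * l1)
    return found, w


# Toy data (hypothetical): feature 0 is informative, feature 2 is misleading.
features = [
    [1.0, 0.0, 0.9],   # 0: defect
    [0.9, 0.0, 0.1],   # 1: defect
    [0.1, 0.0, 1.0],   # 2: normal
    [0.0, 0.0, 0.8],   # 3: normal
    [0.8, 0.0, 0.2],   # 4: defect
    [0.1, 0.0, 0.1],   # 5: normal
]
defects = {0, 1, 4}
found, w = active_discovery(features, lambda i: i in defects, budget=4)
```

In this toy run a single false positive (instance 2) is enough to lower the weight on the misleading feature, so the remaining queries go to true defects. That is the trade-off between labeling cost and detection accuracy that the summary refers to.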