An improved you only look once model for the multi-scale steel surface defect detection with multi-level alignment and cross-layer redistribution features
Published in | Engineering applications of artificial intelligence Vol. 145; p. 110214 |
---|---|
Main Authors | , , , |
Format | Journal Article |
Language | English |
Published | Elsevier Ltd, 01.04.2025 |
Subjects | |
ISSN | 0952-1976 |
DOI | 10.1016/j.engappai.2025.110214 |
Summary: | Steel surface defects vary widely in size and are often irregular in shape. The performance of a detection model depends greatly on effective extraction of cross-scale features. The sequential feature fusion in the traditional Feature Pyramid Network (FPN) loses information during the aggregation and transition between shallow and deep features. Classifying and localizing steel surface defects with a common You Only Look Once (YOLO) detection model may therefore give sub-optimal performance when multi-scale objects are involved. To obtain a complete shallow-deep feature representation for steel surface defects of different scales, an aggregation-redistribution network is introduced into the YOLO detection model to aggregate and refine features across levels. In the aggregation sub-network, a Multi-level Alignment Module (MAM) addresses the feature misalignment in FPN by aligning the feature maps from the level-wise extractors; the pixel-level scale deviation is compensated by multiple parallel dilated convolutions. In the redistribution sub-network, a Fusion-Redistribution Module (FRM) imposes global information on the fused multi-level features to steer, across layers, the generation of the prediction feature maps for YOLO. The global feature provides a semantic adaptive weight and complements the multi-level features through an attention mechanism. Finally, incorporating the aggregation-redistribution network into YOLOv5 yields an improved YOLO detection model for steel surface defects, Aggregation-Redistribution YOLO (ARYOLO). The validation results indicate that ARYOLO achieves satisfactory detection performance on defects with large variations in scale and shape. It reaches a mean average precision of 80.7% on the North Eastern University Surface Defect Dataset (NEU-DET) and 71.2% on the Global 10 Class Metallic Surface Defect Dataset (GC10-DET), indicating strong potential for detection and localization tasks. |
---|---|
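
The summary mentions compensating scale deviation with multiple parallel dilated convolutions. As a rough illustration of that general idea only, the sketch below applies one kernel at several dilation rates to a 1D signal and averages the branches. It is not the paper's Multi-level Alignment Module: the real module works on 2D feature maps with learned weights, and the function names, the 1D setting, the dilation rates `(1, 2, 3)`, and the averaging fusion are all assumptions made for this example.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded, same-length 1D cross-correlation with a dilated kernel.

    Assumes an odd-length kernel so the output stays centred on each input position.
    """
    half = (len(kernel) // 2) * dilation
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            # Kernel taps are spaced `dilation` samples apart.
            j = i + k * dilation - half
            if 0 <= j < len(signal):  # positions outside the signal count as zero
                acc += w * signal[j]
        out.append(acc)
    return out


def parallel_dilated_branches(signal, kernel, dilations=(1, 2, 3)):
    """Run the same kernel at several dilation rates and average the branches.

    Averaging is a hypothetical fusion choice for this sketch, not the paper's.
    """
    branches = [dilated_conv1d(signal, kernel, d) for d in dilations]
    return [sum(vals) / len(branches) for vals in zip(*branches)]
```

Each branch sees the same input at a different receptive field: with a 3-tap kernel, dilation 1 covers 3 neighbouring samples, while dilation 3 spans 7 samples using the same number of weights. That is the property that makes parallel dilated branches a common choice for handling objects at several scales.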