A study of pedestrian detection algorithms for use in complex environments
| Published in | Engineering Research Express, Vol. 7, no. 3, pp. 35283-35302 |
| --- | --- |
| Main Authors | , |
| Format | Journal Article |
| Language | English |
| Published | IOP Publishing, 30.09.2025 |
| ISSN | 2631-8695 |
| DOI | 10.1088/2631-8695/adfc28 |
Summary: In response to the challenge of limited pedestrian-recognition accuracy in complex scenarios, including varying lighting, dense occlusion, adverse weather, crowded environments, and background interference, we present a novel pedestrian detection model based on an enhanced You Only Look Once version 11 (YOLO11). Such scenarios confound existing detection algorithms through feature ambiguity, target occlusion, and background clutter, motivating a more robust detection framework that maintains high accuracy under challenging real-world conditions. First, a new structure, Efficient Multi-Scale Convolution Better (EMSConvB), is proposed to replace the convolutional layer in the residual module of the C3k2 structure. Multi-scale convolution allows the structure to capture a more comprehensive range of spatial information, and a 1 × 1 convolution fuses the resulting features efficiently, strengthening feature expression while preserving computational efficiency. Second, a Multi-Path Coordinate Attention (MPCA) mechanism is designed and embedded into the C3k2 architecture to extract and fuse multi-path features; it adaptively attends to spatial and channel information, improving the reliability and precision of detection. Furthermore, a Context Anchor Attention (CAA) mechanism is added to the backbone network to sharpen the focus on crucial features and improve the model's ability to discern pedestrian targets. Next, a Convolution and Attention Fusion Module (CAFMFusion), based on Convolutional Gated Attention Fusion (CGAFusion), is incorporated into the feature fusion network.
This module captures global context information more accurately in complex background environments, thereby improving pedestrian recognition. Finally, a lightweight detection head, Lightweight Shared Detail-Enhanced Convolution Detection (LSDECD), is devised to enhance feature representation and target-localization accuracy, minimizing accuracy loss while reducing parameter count and computation. Experimentally, the enhanced YOLO11 model achieves a detection precision of 90.8%, marginally below that of the original YOLO11, while its mean detection accuracies improve from 91.0% and 60.9% to 92.2% and 62.2%, respectively. These results demonstrate the proposed algorithm's ability to identify pedestrians in complex environments.
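The EMSConvB idea described above, parallel convolution branches at several kernel sizes whose outputs are collapsed by a pointwise (1 × 1-style) fusion, can be illustrated with a minimal single-channel sketch. This is not the paper's implementation: the function names, the kernel sizes `(1, 3, 5)`, and the random branch weights are all assumptions made purely for illustration, using plain NumPy in place of a deep-learning framework.

```python
import numpy as np

def conv2d_same(x, k):
    """Naive 'same'-padded 2D cross-correlation of a single-channel map."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    xp = np.pad(x, ((ph, ph), (pw, pw)))
    h, w = x.shape
    out = np.empty((h, w), dtype=float)
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(xp[i:i + kh, j:j + kw] * k)
    return out

def emsconv_sketch(x, kernel_sizes=(1, 3, 5)):
    """Multi-scale branches (one kernel size each) fused by a pointwise
    weighted sum -- a stand-in for the multi-scale-plus-1x1 fusion idea."""
    rng = np.random.default_rng(0)  # fixed seed: illustrative weights only
    branches = [conv2d_same(x, rng.standard_normal((ks, ks)) / (ks * ks))
                for ks in kernel_sizes]
    # Pointwise fusion: per-branch scalars collapse the multi-scale stack
    # back to a single map with the input's spatial shape.
    w = rng.standard_normal(len(branches))
    return sum(wi * b for wi, b in zip(w, branches))

x = np.arange(36, dtype=float).reshape(6, 6)
y = emsconv_sketch(x)
print(y.shape)  # (6, 6): every branch preserves the spatial size
```

Because each branch uses "same" padding, all scales produce maps of identical shape, which is what makes the cheap pointwise fusion possible; in a real network the scalar weights would be a learned 1 × 1 convolution over the concatenated branch channels.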
Bibliography: ERX-110174.R1
ISSN: 2631-8695
DOI: 10.1088/2631-8695/adfc28