A study of pedestrian detection algorithms for use in complex environments


Bibliographic Details
Published in: Engineering Research Express, Vol. 7, No. 3, pp. 35283-35302
Main Authors: Liu, Zi-liang; Yaermaimaiti, Yilihamu
Format: Journal Article
Language: English
Published: IOP Publishing, 30.09.2025
ISSN: 2631-8695
DOI: 10.1088/2631-8695/adfc28


Summary: In response to the current challenge of limited pedestrian-recognition accuracy in complex scenarios, including varying lighting conditions, dense occlusion, adverse weather, crowded environments, and background interference, we present a novel pedestrian detection model based on an enhanced You Only Look Once version 11 (YOLO11). Such scenarios pose significant challenges to existing detection algorithms because of feature ambiguity, target occlusion, and background clutter, motivating a more robust detection framework that maintains high accuracy under challenging real-world conditions. First, a new structure, designated Efficient Multi-Scale Convolution Better (EMSConvB), is proposed to replace the convolutional layer in the residual module of the C3k2 structure. Multi-scale convolution lets the structure capture a more comprehensive range of spatial information, and a 1 × 1 convolution then fuses the resulting features efficiently; this strengthens feature expression while preserving computational efficiency. Second, a Multi-Path Coordinate Attention (MPCA) mechanism is designed and embedded in the C3k2 architecture to extract and fuse multi-path features effectively. MPCA adaptively attends to spatial and channel information, which improves the reliability and precision of detection. Furthermore, a Context Anchor Attention (CAA) mechanism is added to the backbone network to sharpen the focus on crucial features and strengthen the model's ability to discern pedestrian targets. Next, a Convolution and Attention Fusion Module (CAFMFusion), based on Convolutional Gated Attention Fusion (CGAFusion), is incorporated into the feature fusion network.
This module captures global context information more accurately in complex background environments, thereby enhancing pedestrian recognition. Finally, a lightweight detection head, Lightweight Shared Detail-Enhanced Convolution Detection (LSDECD), is devised to enhance feature representation and target-localization accuracy, minimizing accuracy loss while reducing the parameter count and computation. Experimentally, the enhanced YOLO11 model achieves a detection precision of 90.8%, marginally lower than that of the original YOLO11, but its mean detection accuracies improve from 91.0% and 60.9% to 92.2% and 62.2%, respectively. These results demonstrate the proposed algorithm's ability to identify pedestrians in complex environments.
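The record describes EMSConvB only at a high level: parallel convolutions at several kernel sizes whose outputs are fused by a 1 × 1 convolution. Purely as an illustration of that general pattern, and not the authors' actual code, the idea can be sketched in NumPy as follows (all function names, channel counts, and kernel choices here are assumptions):

```python
import numpy as np

def conv2d_same(x, w):
    """Naive 2-D convolution with 'same' zero padding.
    x: (C_in, H, W); w: (C_out, C_in, k, k) with odd k."""
    c_out, c_in, k, _ = w.shape
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))
    _, H, W = x.shape
    out = np.zeros((c_out, H, W))
    for o in range(c_out):
        for i in range(H):
            for j in range(W):
                out[o, i, j] = np.sum(xp[:, i:i + k, j:j + k] * w[o])
    return out

def multi_scale_block(x, kernels=(1, 3, 5), seed=0):
    """Illustrative multi-scale convolution: parallel branches with
    different kernel sizes are concatenated along the channel axis,
    then a 1x1 convolution fuses them back to the input channel count."""
    rng = np.random.default_rng(seed)
    c = x.shape[0]
    branches = []
    for k in kernels:
        w = rng.standard_normal((c, c, k, k)) * 0.1   # random weights, for shape illustration only
        branches.append(conv2d_same(x, w))
    cat = np.concatenate(branches, axis=0)            # (len(kernels)*c, H, W)
    w_fuse = rng.standard_normal((c, cat.shape[0], 1, 1)) * 0.1
    return conv2d_same(cat, w_fuse)                   # fused back to (c, H, W)

feat = np.random.default_rng(1).standard_normal((4, 8, 8))
out = multi_scale_block(feat)
print(out.shape)  # (4, 8, 8): channels and spatial size preserved
```

The point of the 1 × 1 fusion step is that the multi-branch concatenation triples the channel count here, and the cheap pointwise convolution restores the original width, which is consistent with the abstract's claim that feature expression improves without sacrificing computational efficiency.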
Bibliography: ERX-110174.R1