DeepGun: Deep Feature-Driven One-Class Classifier for Firearm Detection Using Visual Gun Features and Human Body Pose Estimation

Bibliographic Details
Published in: Applied Sciences, Vol. 15, No. 11, p. 5830
Main Authors: Singh, Harbinder; Deniz, Oscar; Ruiz-Santaquiteria, Jesus; Muñoz, Juan D.; Bueno, Gloria
Format: Journal Article
Language: English
Published: Basel: MDPI AG, 01.06.2025
ISSN: 2076-3417
DOI: 10.3390/app15115830


More Information
Summary: The increasing frequency of mass shootings at public events and in public buildings underscores the limitations of traditional surveillance systems, which rely on human operators monitoring multiple screens. Delayed response times often prevent security teams from intervening before an attack unfolds. Since firearms are rarely seen in public spaces and constitute anomalous observations, firearm detection can be framed as an anomaly detection (AD) problem, for which one-class classifiers (OCCs) are well suited. To address this challenge, we propose a holistic firearm detection approach that integrates OCCs with visual hand-held gun features and human pose estimation (HPE). In the first stage, a variational autoencoder (VAE) learns latent representations of firearm-related instances, ensuring that the latent space is dedicated exclusively to the target class. Hand patches of variable size are extracted from each frame using body landmarks, with patch size adjusted dynamically according to the subject’s distance from the camera. In the second stage, a unified feature vector is formed by concatenating the VAE-extracted latent features with landmark-based arm-positioning features. Finally, an isolation forest (IFC)-based OCC model evaluates this unified feature representation to estimate the probability that a test sample belongs to the firearm-related distribution. By utilizing skeletal representations of human actions, our approach overcomes the limitations of appearance-based gun features extracted from camera footage, which are often affected by background variations. Experimental results on diverse firearm datasets validate the effectiveness of our anomaly detection approach, achieving an F1-score of 86.6%, accuracy of 85.2%, precision of 95.3%, recall of 74.0%, and average precision (AP) of 83.5%. These results demonstrate the superiority of our method over traditional approaches that rely solely on visual features.
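The feature-fusion and one-class scoring stages described in the summary can be sketched as follows. This is a minimal illustration, not the authors' implementation: the VAE and the pose estimator are replaced by random stand-in vectors, and all dimensions and model parameters (32-D latents, 8 pose features, 100 trees) are illustrative assumptions, using scikit-learn's `IsolationForest` as the one-class model.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Stand-ins for the stage-1 outputs: VAE latent codes for hand patches
# (assumed 32-D) and landmark-based arm-positioning features (assumed 8-D).
n_train = 500
vae_latents = rng.normal(size=(n_train, 32))
pose_feats = rng.normal(size=(n_train, 8))

# Stage 2: unified feature vector = concatenation of both representations.
train_feats = np.concatenate([vae_latents, pose_feats], axis=1)

# Stage 3: fit the one-class model on firearm-class samples only, so the
# learned "normal" region corresponds to the firearm-related distribution.
occ = IsolationForest(n_estimators=100, contamination="auto", random_state=0)
occ.fit(train_feats)

# At test time, score_samples gives a continuous anomaly score (higher means
# more in-distribution, i.e. more firearm-like in this one-class setup),
# while predict returns +1 for inliers and -1 for outliers.
test_feats = rng.normal(size=(10, 40))
scores = occ.score_samples(test_feats)
labels = occ.predict(test_feats)
```

In practice the stand-in vectors would be replaced by the VAE encoder's latent output for each hand patch and by features derived from the detected body landmarks; the concatenation and isolation-forest scoring stay the same.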