Automatic Target Recognition using Unmanned Aerial Vehicle Images with Proposed YOLOv8-SR and Enhanced Deep Super-Resolution Network
| Published in | Journal of Electronics, Electromedical Engineering, and Medical Informatics, Vol. 7, no. 4, pp. 1240-1258 |
|---|---|
| Main Authors | , , |
| Format | Journal Article |
| Language | English |
| Published | 15.10.2025 |
| ISSN | 2656-8632 |
| DOI | 10.35882/jeeemi.v7i4.888 |
| Summary: | Modern surveillance requires automatic target recognition (ATR) to identify and classify targets quickly and accurately in unmanned aerial vehicle (UAV) imagery across multiple classes such as pedestrians, people, bicycles, cars, vans, trucks, tricycles, buses, and motors. The low recognition rate in UAV target detection is largely attributable to the poor resolution of images captured from the UAV's aerial perspective. The VisDrone dataset used for image analysis contains a total of 10,209 UAV images. This work presents a comprehensive framework for multiclass target classification on VisDrone UAV imagery. The proposed YOLOv8-SR ("You Only Look Once version 8 with Super-Resolution") model builds on the YOLOv8s model and incorporates the Enhanced Deep Super-Resolution Network (EDSR). YOLOv8-SR uses the EDSR to convert low-resolution images into high-resolution images, estimating pixel values that enable better downstream processing. The high-resolution images generated by the EDSR model achieved a Peak Signal-to-Noise Ratio (PSNR) of 25.32 dB and a Structural Similarity Index (SSIM) of 0.781. Over the range of confidence thresholds, the YOLOv8-SR model achieved a precision of 63.44%, a recall of 46.64%, an F1-score of 52.69%, a mean average precision (mAP@50) of 51.58%, and an mAP@50–95 of 50.67%. These findings point to a marked improvement in the precision and effectiveness of ATR. The key development is the use of an enhanced deep super-resolution network to produce super-resolution images from low-resolution inputs, with the YOLOv8-SR model, a refined version of the YOLOv8s framework, at its core. By combining the EDSR methodology with the YOLOv8-SR framework, the system generates high-resolution images rich in detail, markedly exceeding the informational quality of their low-resolution counterparts. |
|---|---|
| ISSN: | 2656-8632 |
| DOI: | 10.35882/jeeemi.v7i4.888 |
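The summary above describes a two-stage pipeline: an EDSR network first upscales the low-resolution UAV frame, and the super-resolved image is then passed to a YOLOv8s-based detector. The paper's own YOLOv8-SR weights and training details are not part of this record, so the sketch below is only an illustration under assumptions: it substitutes the publicly available OpenCV EDSR model (`EDSR_x4.pb`), the stock Ultralytics `yolov8s.pt` checkpoint, and a placeholder input image `drone_frame.jpg`; the PSNR/SSIM calculation uses a bicubic upscale as a stand-in reference rather than true high-resolution ground truth.

```python
# Minimal sketch of an EDSR -> YOLOv8 pipeline, assuming opencv-contrib-python
# (for cv2.dnn_superres), ultralytics, and scikit-image (>= 0.19) are installed.
# "EDSR_x4.pb", "drone_frame.jpg", and yolov8s.pt are placeholders, not the
# authors' YOLOv8-SR artifacts.
import cv2
from ultralytics import YOLO
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

# 1) Super-resolve the low-resolution UAV frame with a pretrained EDSR model (x4).
sr = cv2.dnn_superres.DnnSuperResImpl_create()
sr.readModel("EDSR_x4.pb")
sr.setModel("edsr", 4)
low_res = cv2.imread("drone_frame.jpg")
high_res = sr.upsample(low_res)

# 2) Image-quality metrics analogous to the PSNR/SSIM reported in the abstract.
#    A bicubic upscale serves here as a stand-in reference; the paper would
#    compare against real high-resolution ground truth.
reference = cv2.resize(low_res, (high_res.shape[1], high_res.shape[0]),
                       interpolation=cv2.INTER_CUBIC)
print("PSNR:", peak_signal_noise_ratio(reference, high_res))
print("SSIM:", structural_similarity(reference, high_res, channel_axis=2))

# 3) Detect targets on the super-resolved image. The stock COCO-trained
#    YOLOv8s checkpoint is used only to show the interface; the paper's
#    YOLOv8-SR is fine-tuned on the VisDrone classes.
detector = YOLO("yolov8s.pt")
results = detector(high_res, conf=0.25)
for box in results[0].boxes:
    cls_name = detector.names[int(box.cls)]
    print(cls_name, float(box.conf), box.xyxy.tolist())
```

This sketch only illustrates how the two stages chain together; reproducing the reported precision, recall, F1, and mAP figures would require the authors' VisDrone-trained YOLOv8-SR weights and evaluation protocol.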