Htfd-yolo: Small target detection in drone aerial photography based on YOLOv8s

Bibliographic Details
Published in The Journal of supercomputing Vol. 81; no. 4; p. 545
Main Authors Sun, Yuheng, Lan, Zhenping, Sun, Yanguo, Guo, Yuepeng, Li, Xinxin, Wang, Yuru, Meng, Yuwei
Format Journal Article
Language English
Published New York Springer Nature B.V 01.03.2025
ISSN 1573-0484
0920-8542
DOI 10.1007/s11227-025-07067-3

Abstract The detection of small targets has significant value in the field of unmanned aerial vehicle (UAV) vision, yet it faces several challenges, including images that are too small, difficulty in distinguishing the target from the background, and the presence of target-dense scenes. This paper presents a novel YOLO-based method for detecting small targets, specifically tailored to UAV photography. Firstly, a detection head is designed for small targets to provide higher-resolution feature maps. Secondly, a three-scale feature fusion module is proposed to fuse the network features with the underlying features. This is intended to improve the fusion of deep semantic features and shallow texture features, provide rich spatial information for the different detection heads, and address the issue of feature loss. Furthermore, a Feature Selection Guidance Module is proposed, which enhances the ability to discriminate small targets by combining the CNN with a nonlinear learning operator. Finally, Soft_NMS is combined with DIOU, and the resulting DIOU_Soft_NMS algorithm is proposed as a replacement for the original non-maximum suppression (NMS) method. This new algorithm effectively handles target crowding and overlap. Experimental results show that the proposed method exhibits superior detection performance in UAV aerial photography scenarios, achieving remarkable outcomes on the VisDrone2019 dataset. On the test set, mAP0.5 reached 45%, a 12.1% improvement over YOLOv8, while mAP0.5-0.95 reached 34.1%, an 11.4% improvement over YOLOv8. This suggests that the method has potential for use in practical UAV tasks. Furthermore, the results provide a solid foundation for future related research.
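The abstract names DIOU_Soft_NMS but does not give its exact form. The following is only a minimal sketch of one plausible reading, assuming Gaussian Soft-NMS score decay in which the overlap term is DIoU (IoU minus the squared center distance divided by the squared diagonal of the smallest enclosing box) instead of plain IoU. The function names, the `sigma` and `score_thresh` parameters, and the choice to clip negative DIoU values to zero are illustrative assumptions, not details taken from the paper.

```python
import numpy as np


def diou(box, boxes):
    """DIoU between one box and an (N, 4) array of boxes in [x1, y1, x2, y2] form."""
    eps = 1e-9
    # Intersection rectangle and plain IoU.
    ix1 = np.maximum(box[0], boxes[:, 0])
    iy1 = np.maximum(box[1], boxes[:, 1])
    ix2 = np.minimum(box[2], boxes[:, 2])
    iy2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(ix2 - ix1, 0, None) * np.clip(iy2 - iy1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    iou = inter / (area_a + area_b - inter + eps)
    # Squared distance between box centers.
    cxa, cya = (box[0] + box[2]) / 2, (box[1] + box[3]) / 2
    cxb, cyb = (boxes[:, 0] + boxes[:, 2]) / 2, (boxes[:, 1] + boxes[:, 3]) / 2
    center_dist2 = (cxa - cxb) ** 2 + (cya - cyb) ** 2
    # Squared diagonal of the smallest box enclosing both boxes.
    ex1 = np.minimum(box[0], boxes[:, 0])
    ey1 = np.minimum(box[1], boxes[:, 1])
    ex2 = np.maximum(box[2], boxes[:, 2])
    ey2 = np.maximum(box[3], boxes[:, 3])
    diag2 = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2
    return iou - center_dist2 / (diag2 + eps)


def diou_soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS using DIoU as the overlap measure (assumed formulation).

    Returns the kept indices in selection order and their decayed scores.
    """
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float).copy()
    remaining = np.arange(len(scores))
    keep = []
    while remaining.size > 0:
        # Select the highest-scoring remaining box.
        pos = int(np.argmax(scores[remaining]))
        current = int(remaining[pos])
        keep.append(current)
        remaining = np.delete(remaining, pos)
        if remaining.size == 0:
            break
        # Assumption: negative DIoU (distant, non-overlapping boxes) causes no decay.
        overlap = np.clip(diou(boxes[current], boxes[remaining]), 0, None)
        # Decay neighbours' scores instead of discarding them outright.
        scores[remaining] *= np.exp(-(overlap ** 2) / sigma)
        # Drop boxes whose score has fallen below the threshold.
        remaining = remaining[scores[remaining] > score_thresh]
    return keep, scores[np.array(keep, dtype=int)]


# Two heavily overlapping boxes and one distant box.
example_boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]]
example_scores = [0.9, 0.8, 0.7]
kept, kept_scores = diou_soft_nms(example_boxes, example_scores)
print(kept, kept_scores)
```

In this sketch the overlapping second box is kept with a reduced score rather than removed, which is the behaviour that lets Soft-NMS variants retain genuinely distinct targets in crowded scenes. A hard-suppression effect can be approximated by raising `score_thresh`.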
ArticleNumber 545
Author Li, Xinxin
Guo, Yuepeng
Lan, Zhenping
Meng, Yuwei
Sun, Yuheng
Sun, Yanguo
Wang, Yuru
Author_xml – sequence: 1
  givenname: Yuheng
  surname: Sun
  fullname: Sun, Yuheng
– sequence: 2
  givenname: Zhenping
  surname: Lan
  fullname: Lan, Zhenping
– sequence: 3
  givenname: Yanguo
  surname: Sun
  fullname: Sun, Yanguo
– sequence: 4
  givenname: Yuepeng
  surname: Guo
  fullname: Guo, Yuepeng
– sequence: 5
  givenname: Xinxin
  surname: Li
  fullname: Li, Xinxin
– sequence: 6
  givenname: Yuru
  surname: Wang
  fullname: Wang, Yuru
– sequence: 7
  givenname: Yuwei
  surname: Meng
  fullname: Meng, Yuwei
ContentType Journal Article
Copyright The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2025.
Copyright_xml – notice: The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2025.
DBID AAYXX
CITATION
JQ2
DOI 10.1007/s11227-025-07067-3
DatabaseName CrossRef
ProQuest Computer Science Collection
DatabaseTitle CrossRef
ProQuest Computer Science Collection
DatabaseTitleList ProQuest Computer Science Collection
DeliveryMethod fulltext_linktorsrc
Discipline Computer Science
EISSN 1573-0484
ExternalDocumentID 10_1007_s11227_025_07067_3
ID FETCH-LOGICAL-c226t-d3abb71a8d8349891fe338feb530aad88bcf68b0e736a8a60919d64c9662df43
ISSN 1573-0484
0920-8542
IngestDate Mon Oct 06 18:35:58 EDT 2025
Wed Oct 01 06:51:13 EDT 2025
IsPeerReviewed true
IsScholarly true
Issue 4
Language English
LinkModel OpenURL
MergedId FETCHMERGED-LOGICAL-c226t-d3abb71a8d8349891fe338feb530aad88bcf68b0e736a8a60919d64c9662df43
PQID 3256605833
PQPubID 2043774
ParticipantIDs proquest_journals_3256605833
crossref_primary_10_1007_s11227_025_07067_3
ProviderPackageCode CITATION
AAYXX
PublicationCentury 2000
PublicationDate 2025-03-01
PublicationDateYYYYMMDD 2025-03-01
PublicationDate_xml – month: 03
  year: 2025
  text: 2025-03-01
  day: 01
PublicationDecade 2020
PublicationPlace New York
PublicationPlace_xml – name: New York
PublicationTitle The Journal of supercomputing
PublicationYear 2025
Publisher Springer Nature B.V
Publisher_xml – name: Springer Nature B.V
SSID ssj0004373
Score 2.3709085
SourceID proquest
crossref
SourceType Aggregation Database
Index Database
StartPage 545
SubjectTerms Accuracy
Aerial photography
Algorithms
Artificial intelligence
Cameras
Drone aircraft
Drones
Feature selection
Localization
Modules
Multisensor fusion
Semantics
Spatial data
Target detection
Target recognition
Unmanned aerial vehicles
Title Htfd-yolo: Small target detection in drone aerial photography based on YOLOv8s
URI https://www.proquest.com/docview/3256605833
Volume 81