Htfd-yolo: Small target detection in drone aerial photography based on YOLOv8s
          | Published in | The Journal of supercomputing Vol. 81; no. 4; p. 545 | 
|---|---|
| Main Authors | Sun, Yuheng; Lan, Zhenping; Sun, Yanguo; Guo, Yuepeng; Li, Xinxin; Wang, Yuru; Meng, Yuwei |
| Format | Journal Article | 
| Language | English | 
| Published | New York: Springer Nature B.V, 01.03.2025 |
| ISSN | 0920-8542 (print); 1573-0484 (electronic) |
| DOI | 10.1007/s11227-025-07067-3 | 
| Abstract | Small-target detection is highly valuable in unmanned aerial vehicle (UAV) vision, yet it faces several challenges: target images that are too small, difficulty in distinguishing targets from the background, and densely packed targets. This paper presents a novel YOLO-based method for detecting small targets, tailored specifically to UAV photography. First, a detection head for small targets is added to provide higher-resolution feature maps. Second, a three-scale feature fusion module is proposed to fuse the network features with the underlying features. It is intended to improve the fusion of deep semantic features and shallow texture features, provide rich spatial information for the different detection heads, and address feature loss. Third, a Feature Selection Guidance Module is proposed, which enhances the ability to discriminate small targets by combining a CNN with a nonlinear learning operator. Finally, Soft_NMS is combined with DIOU, and the resulting DIOU_Soft_NMS algorithm is proposed as a replacement for the original non-maximum suppression (NMS) method. The new algorithm effectively handles crowded and overlapping targets. Experimental results show that the proposed method exhibits superior detection performance in UAV aerial photography scenarios on the VisDrone2019 dataset. On the test set, mAP0.5 reached 45%, a 12.1% improvement over YOLOv8, while mAP0.5:0.95 reached 34.1%, an 11.4% improvement over YOLOv8. This suggests that the method has potential for practical UAV tasks, and the results provide a foundation for future related research. |
    
|---|---|
| ArticleNumber | 545 | 
    
| Author | Li, Xinxin; Guo, Yuepeng; Lan, Zhenping; Meng, Yuwei; Sun, Yuheng; Sun, Yanguo; Wang, Yuru |
    
| Author_xml | 1. Sun, Yuheng; 2. Lan, Zhenping; 3. Sun, Yanguo; 4. Guo, Yuepeng; 5. Li, Xinxin; 6. Wang, Yuru; 7. Meng, Yuwei |
    
| ContentType | Journal Article | 
    
| Copyright | The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2025. | 
    
| DatabaseName | CrossRef; ProQuest Computer Science Collection |
    
| Discipline | Computer Science | 
    
| EISSN | 1573-0484 | 
    
| IsPeerReviewed | true | 
    
| IsScholarly | true | 
    
| Issue | 4 | 
    
| PublicationCentury | 2000 | 
    
| PublicationDate | 2025-03-01 | 
    
| PublicationDecade | 2020 | 
    
| PublicationPlace | New York | 
    
| PublicationTitle | The Journal of supercomputing | 
    
| PublicationYear | 2025 | 
    
| Publisher | Springer Nature B.V | 
    
| StartPage | 545 | 
    
| SubjectTerms | Accuracy; Aerial photography; Algorithms; Artificial intelligence; Cameras; Drone aircraft; Drones; Feature selection; Localization; Modules; Multisensor fusion; Semantics; Spatial data; Target detection; Target recognition; Unmanned aerial vehicles |
    
| Title | Htfd-yolo: Small target detection in drone aerial photography based on YOLOv8s | 
    
| URI | https://www.proquest.com/docview/3256605833 | 
    
| Volume | 81 | 
    
| thumbnail_s | http://covers-cdn.summon.serialssolutions.com/index.aspx?isbn=/sc.gif&issn=1573-0484&client=summon |