Collaborative air combat maneuvering decision-making method based on graph convolutional deep reinforcement learning

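To address the intelligent cooperative air combat confrontation decision-making problem for multiple unmanned aerial vehicles (UAVs), a multi-UAV cooperative air combat maneuvering confrontation decision-making method based on long short-term memory (LSTM) and dueling graph convolutional deep reinforcement learning is proposed. First, the multi-UAV cooperative air combat confrontation problem is described. Second, an LSTM network is introduced into the dueling Q network to process air combat information with strong temporal correlation. Next, a graph convolutional network is built as the basis for communication among the UAVs, a cooperative air combat training framework based on the LSTM and dueling graph convolutional deep reinforcement learning algorithm is proposed, and the training algorithm for cooperative air combat decision-making is designed. Two-versus-one air combat simulation results verify the effectiveness of the proposed cooperative intelligent confrontation decision-making method, which offers fast decision-making, a stable learning process, and the ability to learn cooperative strategies that adapt to rapid changes in the air combat environment....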

Bibliographic Details
Published in	Chinese Journal of Engineering (工程科学学报) Vol. 46; no. 7; pp. 1227 - 1236
Main Authors	欧洋 (OU Yang), 郭正玉 (GUO Zhengyu), 罗德林 (LUO Delin), 缪克华 (MIAO Kehua)
Format	Journal Article
Language	Chinese
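Published	National Key Laboratory of Air-based Information Perception and Fusion, Luoyang 471000 01.07.2024 School of Aerospace Engineering, Xiamen University, Xiamen 361102%China Airborne Missile Academy, Luoyang 471000%School of Aerospace Engineering, Xiamen University, Xiamen 361102
Subjects	multi-unmanned aerial vehicle; multimachine collaboration; deep reinforcement learning; air combat decision-making; maneuver decision making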
ISSN	2095-9389
DOI	10.13374/j.issn2095-9389.2023.09.25.004

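Abstract	To address the intelligent cooperative air combat confrontation decision-making problem for multiple unmanned aerial vehicles (UAVs), a multi-UAV cooperative air combat maneuvering confrontation decision-making method based on long short-term memory (LSTM) and dueling graph convolutional deep reinforcement learning is proposed. First, the multi-UAV cooperative air combat confrontation problem is described. Second, an LSTM network is introduced into the dueling Q network to process air combat information with strong temporal correlation. Next, a graph convolutional network is built as the basis for communication among the UAVs, a cooperative air combat training framework based on the LSTM and dueling graph convolutional deep reinforcement learning algorithm is proposed, and the training algorithm for cooperative air combat decision-making is designed. Two-versus-one air combat simulation results verify the effectiveness of the proposed cooperative intelligent confrontation decision-making method, which offers fast decision-making, a stable learning process, and the ability to learn cooperative strategies that adapt to rapid changes in the air combat environment.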
Abstract_FL	Effective multi-unmanned aerial vehicle (UAV) decision making and more efficient coordinated mission execution are currently top priorities of air combat research. To solve the problem of multi-UAV cooperative air combat maneuvering confrontation, a decision-making method based on long short-term memory (LSTM) and dueling graph convolutional deep reinforcement learning is proposed. First, the problem of multi-UAV cooperative air combat maneuvering confrontation is described. Second, an LSTM network is introduced into the deep dueling Q network to process air combat information with strong temporal correlation. Further, a graph convolutional network is built as the communication basis between multiple UAVs, and a cooperative air combat training framework based on the LSTM and dueling graph convolutional deep reinforcement learning algorithm is proposed to improve convergence. In the proposed method, the communication problem between UAVs is transformed into a graph model in which each UAV is regarded as a node and its observation state is regarded as the node's attribute. The convolutional layer captures the cooperative relationships among nodes, and communication between UAVs is realized through information sharing. Subsequently, the extracted time-sequenced air combat features are fed into the LSTM and deep dueling Q networks to evaluate action values. The LSTM network processes sequence information and encodes historical states into its hidden state, so that the network can better capture temporal dependencies and thus better predict the value function of the current state.
The simulation results show that when the opponent adopts a fixed, nonmaneuvering strategy, the UAV formation using the proposed method as its core decision-making strategy can learn a reasonable maneuvering strategy and cooperate to a certain extent. This demonstrates the effectiveness of the algorithm in multi-UAV collaborative air combat maneuvering confrontation problems, enabling UAV formations to achieve teamwork and improve air combat efficiency. In a two-on-one air combat situation, the greedy algorithm is used as the decision-making strategy of the enemy aircraft. The results of simulation comparison experiments show that when faced with opponents using certain rules and strategies, the red team formation can learn reasonable maneuver confrontation strategies and cooperate in the decision-making process to form certain air combat tactics, which improves the combat efficiency of the red team. Compared with the basic method, the proposed method exhibits a more stable learning process and faster decision-making speed for UAV cooperative air combat.
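The two building blocks named in the abstract can be illustrated with a minimal sketch. This record does not give the paper's equations, so the code below uses textbook forms only: a mean-aggregation graph convolution (learned weights and nonlinearity omitted) and the standard dueling aggregation Q = V + A - mean(A). The LSTM stage is left out, and every function name, shape, and number here is a hypothetical choice for illustration, not the authors' implementation.

```python
# Illustrative sketch only: the record does not state the paper's equations.
# All names, shapes, and numbers below are hypothetical.

def graph_conv(adjacency, features):
    """Mean-aggregate each node's features over itself and its neighbours.

    adjacency[i][j] == 1 means UAV i receives information from UAV j.
    A learned weight matrix and nonlinearity are omitted for brevity.
    """
    n = len(features)
    out = []
    for i in range(n):
        # Include the node itself (self-loop) plus its neighbours.
        members = [j for j in range(n) if j == i or adjacency[i][j]]
        dim = len(features[i])
        out.append([sum(features[j][k] for j in members) / len(members)
                    for k in range(dim)])
    return out


def dueling_q(state_value, advantages):
    """Combine V(s) and A(s, a) as Q = V + A - mean(A)."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


# Two cooperating UAVs, each with a 2-dimensional observation.
adjacency = [[0, 1], [1, 0]]
observations = [[1.0, 0.0], [0.0, 1.0]]
shared = graph_conv(adjacency, observations)

# One state value and three candidate maneuvers for a single UAV.
q_values = dueling_q(2.0, [1.0, 2.0, 3.0])
best_action = max(range(len(q_values)), key=q_values.__getitem__)
```

In this toy case each UAV ends up with the average of both observations, which is the information-sharing effect the abstract attributes to the convolutional layer; subtracting the mean advantage keeps the value and advantage streams identifiable.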
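Author	欧洋 (OU Yang); 缪克华 (MIAO Kehua); 罗德林 (LUO Delin); 郭正玉 (GUO Zhengyu)
AuthorAffiliation	School of Aerospace Engineering, Xiamen University, Xiamen 361102%China Airborne Missile Academy, Luoyang 471000%School of Aerospace Engineering, Xiamen University, Xiamen 361102; National Key Laboratory of Air-based Information Perception and Fusion, Luoyang 471000
AuthorAffiliation_xml	– name: School of Aerospace Engineering, Xiamen University, Xiamen 361102%China Airborne Missile Academy, Luoyang 471000%School of Aerospace Engineering, Xiamen University, Xiamen 361102; National Key Laboratory of Air-based Information Perception and Fusion, Luoyang 471000
Author_FL	GUO Zhengyu; MIAO Kehua; LUO Delin; OU Yang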
Author_FL_xml	– sequence: 1 fullname: OU Yang – sequence: 2 fullname: GUO Zhengyu – sequence: 3 fullname: LUO Delin – sequence: 4 fullname: MIAO Kehua
Author_xml	– sequence: 1 fullname: 欧洋 – sequence: 2 fullname: 郭正玉 – sequence: 3 fullname: 罗德林 – sequence: 4 fullname: 缪克华
BookMark	eNo9j8tKw0AYhWdRwVr7HOIi8Z9bJ7PU4g0KbnRdcpmRRknBQewDiIi02o1BVNBV1UUR6aI0iE_TSV7DiOLqg3PgO5wlVEm6iUJoBYOLKRVsLXY7xiQEJHck9aRLgFAXSnIXgFVQ9b9aRHVjOgFwTAWWBKpowz5l8-zaPnzZwbR4ec-nHzYb2c_M9lM7Hs1nz8X9uR3c2GG_eMvyy7v8MbNXr_ZiUozTPJ3lk9tltKD9Y6Pqf6yhg63N_eaO09rb3m2utxyDgQkn1BDpUBEBoVaaK-UB1VKEzCOgggYhPudhQwaeJz3JaRhgJgnVUUQw-8lpDa3-es_8RPvJYTvunp4k5WI7iI_iqNcLyuMMBICg3yyramU
ClassificationCodes	TG142.71
ContentType	Journal Article
Copyright	Copyright © Wanfang Data Co. Ltd. All Rights Reserved.
Copyright_xml	– notice: Copyright © Wanfang Data Co. Ltd. All Rights Reserved.
DBID	2B. 4A8 92I 93N PSX TCJ
DOI	10.13374/j.issn2095-9389.2023.09.25.004
DatabaseName	Wanfang Data Journals - Hong Kong; WANFANG Data Centre; Wanfang Data Journals; Wanfang Data Journals - Hong Kong Edition (万方数据期刊 - 香港版); China Online Journals (COJ); China Online Journals (COJ)
DatabaseTitleList
DeliveryMethod	fulltext_linktorsrc
DocumentTitle_FL	Collaborative air combat maneuvering decision-making method based on graph convolutional deep reinforcement learning
EndPage	1236
ExternalDocumentID	bjkjdxxb202407007
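GrantInformation_xml	– fundername: (Xiamen Municipal Bureau of Science and Technology, Xiamen Industry-University-Research Project); (Joint funding project of the National Key Laboratory of Air-based Information Perception and Fusion and the Aeronautical Science Foundation) funderid: (Xiamen Municipal Bureau of Science and Technology, Xiamen Industry-University-Research Project); (Joint funding project of the National Key Laboratory of Air-based Information Perception and Fusion and the Aeronautical Science Foundation)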
GroupedDBID	-0C -SC -S~ 2B. 2RA 4A8 5VR 92I 92M 93N 9D9 9DC AAITT AFUIB ALMA_UNASSIGNED_HOLDINGS CAJEC CQIGP FA0 GROUPED_DOAJ JUIAU PB1 PB6 PSX Q-- Q-2 R-C RT3 T8S TCJ U1F U5C
ID	FETCH-LOGICAL-s1047-cf0dfce270cfef5ee803f97c4820eb622a55c69b8898953cb14923fdd214c69b3
ISSN	2095-9389
IngestDate	Thu May 29 04:07:32 EDT 2025
IsPeerReviewed	false
IsScholarly	true
Issue	7
Keywords	multi-unmanned aerial vehicle; multimachine collaboration; deep reinforcement learning; air combat decision-making; maneuver decision making
Language	Chinese
LinkModel	OpenURL
MergedId	FETCHMERGED-LOGICAL-s1047-cf0dfce270cfef5ee803f97c4820eb622a55c69b8898953cb14923fdd214c69b3
PageCount	10
ParticipantIDs	wanfang_journals_bjkjdxxb202407007
PublicationCentury	2000
PublicationDate	2024-07-01
PublicationDateYYYYMMDD	2024-07-01
PublicationDate_xml	– month: 07 year: 2024 text: 2024-07-01 day: 01
PublicationDecade	2020
PublicationTitle	工程科学学报
PublicationTitle_FL	Chinese Journal of Engineering
PublicationYear	2024
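Publisher	National Key Laboratory of Air-based Information Perception and Fusion, Luoyang 471000 School of Aerospace Engineering, Xiamen University, Xiamen 361102%China Airborne Missile Academy, Luoyang 471000%School of Aerospace Engineering, Xiamen University, Xiamen 361102
Publisher_xml	– name: School of Aerospace Engineering, Xiamen University, Xiamen 361102%China Airborne Missile Academy, Luoyang 471000%School of Aerospace Engineering, Xiamen University, Xiamen 361102 – name: National Key Laboratory of Air-based Information Perception and Fusion, Luoyang 471000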
SSID	ssib051371920 ssib023167159 ssj0003313525 ssib022319478 ssib041261352 ssib036435564
Score	2.3989651
Snippet	TG142.71;...
SourceID	wanfang
SourceType	Aggregation Database
StartPage	1227
Title	Collaborative air combat maneuvering decision-making method based on graph convolutional deep reinforcement learning
URI	https://d.wanfangdata.com.cn/periodical/bjkjdxxb202407007
Volume	46
hasFullText	1
inHoldings	1
isFullTextHit
isPrint
link	http://utb.summon.serialssolutions.com/2.0.0/link/0/eLvHCXMwnR1daxQxcKkVxBdRVPymiMEHubqbj032cfdujyLoUwt9K7dfSoUTbAulzyIirfpiERX0qepDEelD6SH-An9G9-5vODOb7q1YpApHyE4mM5OZzWaSSyaOc6NwPZN7omhxk_stWf2_m_GWj4FlikxnBZ2Qu3vPn5mTd-bV_MSxn41dSyvLyXS6dui5kv-xKsDArnhK9h8sWxMFAOTBvpCChSE9ko1ZrFjQZVHIYompiQkSsYgypsMizWLNQvh1WezjY-RhESCHPmXaVB2Q2ywgSNihIiAYsNDF6gFQlpag6RILF_GRckDVfWYMCwxmgpogsDCUAb6CkDvEwscUiMcEry6_PPCPSSTAVIRvmIms_IHXkK3O-MRFHbw1CAjbiI60wUuOxiUBMwKrIQpUJoFAXyYYo4ByQEJNMsSkOhA1ZoH-DQXoV-1TJF2llbi5esJlvdO2et8bmmoarMtCj9rgWfOAlqEUNd4lFQCOIaMalAP4WNUby7_SL9q7Qw3UhAMpJ_0CQowVoQhERoMRJJTQr6yGoC6UguUicUtiCCZ3PDJwFy_YFFZDdhizK7lVd9WNMcnjVfQF699guJ1Dx04htKTBE3nULKZBawJDAXNcfJRjt6HezJksPlzMVlcTVC8MIBja4TjX4Mk2Fjjg6w5-qBc0gsdxjMHQcKYF-MaqEblIejC5F6p2vpUntHdwtRr6WUJgMe5RroU94dw8aMntv7eDzuj1i17_fsOdnD3tnLLzwKmw6tRnnIm1B2edqPww2B-8KN_9KDd2R5--Dne_lYOt8vugXN8st7f29z6O3j4pN16Wr9ZHXwbDZ2-G7wfl88_l053R9uZwc2-48_qcM9eNZ9szLXvJSWuJoqSkhZsVac61mxZ5ofLcuKIIdCrBNc8Tn_OeUqkfJAbveVUiTTwMqVhkGfckwsV5Z7L_qJ9fcKY0z5XJMpFylclEwsyj0D2eGZ4pmHQUvYvOddvmBfsRW1r4w3iXjoJ02Tk57kxXnMnlxyv5VXDOl5NrZPNfVPerTw
linkProvider	Directory of Open Access Journals
openUrl	ctx_ver=Z39.88-2004&ctx_enc=info%3Aofi%2Fenc%3AUTF-8&rfr_id=info%3Asid%2Fsummon.serialssolutions.com&rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Ajournal&rft.genre=article&rft.atitle=%E5%9F%BA%E4%BA%8E%E5%9B%BE%E5%8D%B7%E7%A7%AF%E6%B7%B1%E5%BA%A6%E5%BC%BA%E5%8C%96%E5%AD%A6%E4%B9%A0%E7%9A%84%E5%8D%8F%E5%90%8C%E7%A9%BA%E6%88%98%E6%9C%BA%E5%8A%A8%E5%86%B3%E7%AD%96%E6%96%B9%E6%B3%95&rft.jtitle=%E5%B7%A5%E7%A8%8B%E7%A7%91%E5%AD%A6%E5%AD%A6%E6%8A%A5&rft.au=%E6%AC%A7%E6%B4%8B&rft.au=%E9%83%AD%E6%AD%A3%E7%8E%89&rft.au=%E7%BD%97%E5%BE%B7%E6%9E%97&rft.au=%E7%BC%AA%E5%85%8B%E5%8D%8E&rft.date=2024-07-01&rft.pub=%E7%A9%BA%E5%9F%BA%E4%BF%A1%E6%81%AF%E6%84%9F%E7%9F%A5%E4%B8%8E%E8%9E%8D%E5%90%88%E5%85%A8%E5%9B%BD%E9%87%8D%E7%82%B9%E5%AE%9E%E9%AA%8C%E5%AE%A4%2C%E6%B4%9B%E9%98%B3+471000&rft.issn=2095-9389&rft.volume=46&rft.issue=7&rft.spage=1227&rft.epage=1236&rft_id=info:doi/10.13374%2Fj.issn2095-9389.2023.09.25.004&rft.externalDocID=bjkjdxxb202407007
thumbnail_s	http://utb.summon.serialssolutions.com/2.0.0/image/custom?url=http%3A%2F%2Fwww.wanfangdata.com.cn%2Fimages%2FPeriodicalImages%2Fbjkjdxxb%2Fbjkjdxxb.jpg