Decentralized Multi-Agent Pursuit Using Deep Reinforcement Learning
| Published in | IEEE Robotics and Automation Letters, Vol. 6, No. 3, pp. 4552-4559 |
|---|---|
| Main Authors | |
| Format | Journal Article |
| Language | English |
| Published | Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.07.2021 |
| Subjects | |
| ISSN | 2377-3766 |
| DOI | 10.1109/LRA.2021.3068952 |
| Summary | Pursuit-evasion is the problem of capturing mobile targets with one or more pursuers. We use deep reinforcement learning for pursuing an omnidirectional target with multiple, homogeneous agents that are subject to unicycle kinematic constraints. We use shared experience to train a policy for a given number of pursuers, executed independently by each agent at run-time. The training uses curriculum learning, a sweeping-angle ordering to locally represent neighboring agents, and a reward structure that encourages a good formation and combines individual and group rewards. Simulated experiments with a reactive evader and up to eight pursuers show that our learning-based approach outperforms recent reinforcement learning techniques as well as non-holonomic adaptations of classical algorithms. The learned policy is successfully transferred to the real world in a proof-of-concept demonstration with three motion-constrained pursuer drones. |
|---|---|
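The summary above names a few concrete ingredients: unicycle kinematic constraints on each pursuer, an egocentric observation in which neighboring agents are ordered by sweeping angle, and a reward that combines individual and group terms. The sketch below is only a minimal illustration of those ideas, not the authors' implementation; the function names, the (range, bearing) neighbor encoding, and the weights `w_individual`/`w_group` are assumptions.

```python
import numpy as np

def unicycle_step(state, v, omega, dt=0.1):
    """Advance one pursuer under unicycle kinematics; state = (x, y, heading)."""
    x, y, theta = state
    return np.array([x + v * np.cos(theta) * dt,
                     y + v * np.sin(theta) * dt,
                     theta + omega * dt])

def _wrap(angle):
    """Wrap an angle to [-pi, pi)."""
    return (angle + np.pi) % (2 * np.pi) - np.pi

def local_observation(own_state, neighbor_states, evader_pos):
    """Egocentric observation with neighbors sorted by relative bearing.

    Sorting by bearing (a sweeping-angle ordering) rather than by agent index
    keeps the observation consistent for a homogeneous team, so one shared
    policy can be executed independently by every pursuer.
    """
    x, y, theta = own_state
    neighbors = []
    for nx, ny, _ in neighbor_states:
        dx, dy = nx - x, ny - y
        neighbors.append((np.hypot(dx, dy), _wrap(np.arctan2(dy, dx) - theta)))
    neighbors.sort(key=lambda rb: rb[1])
    ex, ey = evader_pos
    evader = (np.hypot(ex - x, ey - y), _wrap(np.arctan2(ey - y, ex - x) - theta))
    return np.array(list(evader) + [v for rb in neighbors for v in rb])

def shaped_reward(own_dist, prev_own_dist, team_min_dist, prev_team_min_dist,
                  w_individual=0.5, w_group=0.5):
    """Toy shaped reward combining individual and group progress toward the evader."""
    return (w_individual * (prev_own_dist - own_dist) +
            w_group * (prev_team_min_dist - team_min_dist))

if __name__ == "__main__":
    pursuer = np.array([0.0, 0.0, 0.0])
    neighbors = [np.array([1.0, 1.0, 0.5]), np.array([-1.0, 2.0, -0.3])]
    evader = np.array([3.0, 0.5])
    print(local_observation(pursuer, neighbors, evader))
    print(unicycle_step(pursuer, v=0.5, omega=0.2))
```

The bearing-sorted layout is one plausible reading of the "sweeping-angle ordering" mentioned in the summary; the curriculum schedule, formation-shaping terms, and the paper's actual reward weights are not reproduced here.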