Swarm intelligence in cooperative environments: N-step dynamic tree search algorithm extended analysis

Espinós Longa, Marc; Tsourdos, Antonios; Inalhan, Gokhan

dc.contributor.author	Espinós Longa, Marc
dc.contributor.author	Tsourdos, Antonios
dc.contributor.author	Inalhan, Gokhan
dc.date.accessioned	2022-09-22T11:16:49Z
dc.date.available	2022-09-22T11:16:49Z
dc.date.issued	2022-09-05
dc.identifier.citation	Espinós Longa M, Tsourdos A, Inalhan G. (2022) Swarm intelligence in cooperative environments: N-step dynamic tree search algorithm extended analysis. In: 2022 American Control Conference (ACC), 8-10 June 2022, Atlanta, GA, USA. pp. 761-766	en_UK
dc.identifier.isbn	978-1-6654-9480-9
dc.identifier.issn	0743-1619
dc.identifier.uri	https://doi.org/10.23919/ACC53348.2022.9867171
dc.identifier.uri	https://dspace.lib.cranfield.ac.uk/handle/1826/18463
dc.description.abstract	Reinforcement learning tree-based planning methods have been gaining popularity in the last few years due to their success in single-agent domains where a perfect simulator model is available, e.g., the strategic board games Go and chess. This paper aims to extend tree search algorithms to the multi-agent setting in a decentralized structure, dealing with scalability issues and the exponential growth of computational resources. The N-Step Dynamic Tree Search combines forward planning and direct temporal-difference updates, markedly outperforming state-of-the-art algorithms such as Q-Learning and SARSA. Future state transitions and rewards are predicted with a model built and learned from real interactions between agents and the environment. As an extension of previous work, this paper analyses the developed algorithm in the Hunter-Pursuit cooperative game against intelligent evaders. The N-Step Dynamic Tree Search aims to adapt the most successful single-agent learning methods to the multi-agent setting and proves to be a remarkable advance over conventional temporal-difference techniques.	en_UK
dc.description.sponsorship	Engineering and Physical Sciences Research Council (EPSRC): 2454254. BAE Systems	en_UK
dc.language.iso	en	en_UK
dc.publisher	IEEE	en_UK
dc.rights	Attribution-NonCommercial 4.0 International	*
dc.rights.uri	http://creativecommons.org/licenses/by-nc/4.0/	*
dc.subject	Learning systems	en_UK
dc.subject	Q-learning	en_UK
dc.subject	Heuristic algorithms	en_UK
dc.subject	Computational modeling	en_UK
dc.subject	Scalability	en_UK
dc.subject	Games	en_UK
dc.subject	Predictive models	en_UK
dc.title	Swarm intelligence in cooperative environments: N-step dynamic tree search algorithm extended analysis	en_UK
dc.type	Conference paper	en_UK
dc.identifier.eisbn	978-1-6654-5196-3