Multi-agent deep reinforcement learning for solving large-scale air traffic flow management problem: a time-step sequential decision approach

Date published

2022-11-15

Publisher

IEEE

Type

Conference paper

Citation

Tang Y, Xu Y. (2021) Multi-agent deep reinforcement learning for solving large-scale air traffic flow management problem: a time-step sequential decision approach. In: 2021 AIAA/IEEE 40th Digital Avionics Systems Conference (DASC), 3-7 October 2021, San Antonio, USA

Abstract

In this paper, we focus on the demand-capacity balancing (DCB) problem in air traffic flow management, which we treat as a fully cooperative multi-agent learning task. First, a rule-based time-step environment is designed to mimic the DCB process. In this environment, each agent ‘flight’ decides its action at valid time steps. To ease the learning process, three different rules are defined, based on the remaining capacity and the number of cooperative flights in each sector. Second, a multi-agent reinforcement learning framework built on proximal policy optimization (MAPPO) is proposed. It uses a parameter sharing mechanism and the mean-field approximation method, in which an inherent feature of all other agents is extracted to address the credit assignment problem. Moreover, a supervisor-integrated MAPPO framework is proposed, in which a supervisor generates supervised actions to further improve the learning performance. In the experiments, two performance indices, Search Capability and Generalization Capability, are considered. Both indices are assessed on two toy cases and a real-world case study. Results suggest that the supervisor-integrated MAPPO with supervised actions achieves the best performance across the different cases. The other proposed methods also show promising Search Capability, but show acceptable Generalization Capability only in cases simpler than the training cases.
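
As a rough illustration of the two ideas the abstract names, the Python sketch below counts demand per sector and time step to find capacity overloads, and computes a simple average of the other agents' actions as a mean-field style summary. It is not the paper's environment or network: the data layout, the function names (`sector_demand`, `hotspots`, `mean_other_action`) and the use of a plain action average are all assumptions made for this example.

```python
from collections import Counter


def sector_demand(flights, delays):
    """Count flights per (sector, time_step) after shifting each flight by its delay.

    `flights` maps a flight id to a list of (sector, entry_time_step) pairs;
    `delays` maps a flight id to a ground delay in time steps (hypothetical layout).
    """
    demand = Counter()
    for flight_id, entries in flights.items():
        shift = delays.get(flight_id, 0)
        for sector, step in entries:
            demand[(sector, step + shift)] += 1
    return demand


def hotspots(demand, capacity):
    """Return the (sector, time_step) pairs whose demand exceeds the sector capacity."""
    return sorted(key for key, count in demand.items() if count > capacity[key[0]])


def mean_other_action(actions, agent):
    """Average action of every agent except `agent` (an assumed mean-field summary)."""
    others = [a for name, a in actions.items() if name != agent]
    return sum(others) / len(others) if others else 0.0


# Toy data: three flights, two sectors (all values invented for illustration).
flights = {
    "F1": [("A", 0), ("B", 1)],
    "F2": [("A", 0), ("B", 2)],
    "F3": [("A", 0)],
}
capacity = {"A": 2, "B": 1}

print(hotspots(sector_demand(flights, {}), capacity))         # [('A', 0)]
print(hotspots(sector_demand(flights, {"F3": 1}), capacity))  # []
print(hotspots(sector_demand(flights, {"F1": 1}), capacity))  # [('B', 2)]
print(mean_other_action({"F1": 1, "F2": 0, "F3": 1}, "F1"))   # 0.5
```

In the toy data, delaying F3 clears the overload in sector A, while delaying F1 instead moves the overload to sector B. This coupling between flights is why the task is framed as cooperative.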

Keywords

air traffic flow management, demand-capacity balance, multi-agent reinforcement learning, proximal policy optimization

Rights

Attribution-NonCommercial 4.0 International