Demand and capacity balancing technology based on multi-agent reinforcement learning
Abstract
To effectively solve Demand and Capacity Balancing (DCB) in large-scale, high-density scenarios through the Ground Delay Program (GDP) in the pre-tactical stage, a sequential decision-making framework based on a time window is proposed. On this basis, the problem is transformed into a Markov Decision Process (MDP) based on local observation, and a Multi-Agent Reinforcement Learning (MARL) method is adopted. Each flight is regarded as an independent agent that decides whether to implement GDP according to its local state observation. By designing the reward function in multiple combinations, a Mixed Competition and Cooperation (MCC) mode that considers fairness is formed among the agents. To improve the efficiency of MARL, we use the double Q-Learning Network (DQN), experience replay, an adaptive ϵ-greedy strategy and the Decentralized Training with Decentralized Execution (DTDE) framework. The experimental results show that the training process of the MARL method is convergent, efficient and stable. Compared with the Computer-Assisted Slot Allocation (CASA) method used in actual operation, the number of flight delays and the average delay time are reduced by 33.7% and 36.7%, respectively.
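As a rough illustration of two of the techniques named above, the following minimal sketch shows ϵ-greedy action selection and a double Q-learning target for a single flight agent with two actions. It is not the authors' implementation: the action encoding (DEPART, HOLD), the linear decay schedule in decayed_epsilon, the discount factor and all function names are assumptions made for illustration, and the abstract does not specify how the adaptive ϵ rule or the networks are actually defined.

```python
import random

# Hypothetical action encoding: the abstract only says each flight decides
# whether to implement GDP; these names are illustrative.
DEPART, HOLD = 0, 1


def argmax(values):
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=lambda a: values[a])


def epsilon_greedy(q_values, epsilon, rng=random):
    """With probability epsilon explore at random, otherwise act greedily."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return argmax(q_values)


def decayed_epsilon(step, start=1.0, end=0.05, decay_steps=1000):
    """Assumed linear decay from start to end; a stand-in for the
    unspecified adaptive schedule."""
    frac = min(step / decay_steps, 1.0)
    return start + frac * (end - start)


def double_q_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double Q-learning target: the online estimate selects the next
    action, the target estimate evaluates it."""
    if done:
        return reward
    best = argmax(next_q_online)
    return reward + gamma * next_q_target[best]


if __name__ == "__main__":
    q_online_next = [0.2, 0.8]   # made-up Q-values for (DEPART, HOLD)
    q_target_next = [0.5, 0.1]
    action = epsilon_greedy(q_online_next, decayed_epsilon(step=500))
    target = double_q_target(1.0, q_online_next, q_target_next, gamma=0.5)
    print(action, target)
```

Separating action selection (online estimate) from action evaluation (target estimate) is what distinguishes the double Q-learning target from the plain Q-learning target, which takes the maximum of a single estimate and tends to overestimate values.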