Nash double Q-based multi-agent deep reinforcement learning for interactive merging strategy in mixed traffic
Abstract
The interaction between ramp and mainline vehicles plays a crucial role in merging areas, especially in mixed-traffic environments. The driving behaviours of human drivers are uncertain and diverse, and this uncertainty makes it more complex for connected automated vehicles (CAVs) to plan trajectories and merge into the mainline. To overcome this problem, an interactive merging strategy based on multi-agent deep reinforcement learning (MADRL) is designed, enabling the ramp vehicle (a CAV) to consider the dynamic reactions of mainline vehicles. Our interactive strategy contains two agents: one predicts and analyses the behaviour of mainline vehicles (human-driven vehicles, HDVs, or non-connected vehicles), while the other explores optimal merging actions for the ramp vehicle. Firstly, game theory is used to model the competitive behaviours between ramp and mainline vehicles, and the Nash equilibrium of joint actions guides the ramp vehicle to learn the best response to the mainline vehicle. Secondly, the Nash double Q algorithm is developed to ensure that the outputs of the Q networks are trained to converge efficiently to the Nash equilibrium point. The trained Q networks are then used for online control. Finally, our strategy is compared with single-agent RL and existing MADRL algorithms in real on-ramp scenarios. Simulations show that our strategy successfully coordinates both vehicles by analysing human drivers' behaviour, resulting in improved driving performance in terms of global safety, efficiency, and comfort.
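
The abstract gives no implementation details, so the following is a minimal, tabular sketch of how a Nash-equilibrium-guided double-Q update for the two agents (ramp CAV and mainline HDV) might look. The state discretisation, the three-action sets, the pure-strategy equilibrium search, and all names below are illustrative assumptions; the paper itself trains deep Q networks rather than Q tables.

```python
# Hedged sketch: Nash-guided double-Q updates for a two-agent merging game.
# Tabular simplification of the deep setting described in the abstract;
# sizes, action semantics, and helper names are assumptions, not the paper's.
import numpy as np

N_STATES = 10    # discretised joint traffic state (assumption)
N_ACTIONS = 3    # e.g. {decelerate, hold, accelerate} per vehicle (assumption)

rng = np.random.default_rng(0)

# Double-Q pair per agent, indexed by (state, ramp action, mainline action).
Q = {agent: [np.zeros((N_STATES, N_ACTIONS, N_ACTIONS)) for _ in range(2)]
     for agent in ("ramp", "main")}

def pure_nash(q_ramp, q_main):
    """Enumerate joint actions of the stage game given by (q_ramp, q_main)
    and return a pure-strategy Nash equilibrium; fall back to the jointly
    greedy pair if no pure equilibrium exists."""
    for a in range(N_ACTIONS):          # ramp action
        for b in range(N_ACTIONS):      # mainline action
            ramp_best = q_ramp[a, b] >= q_ramp[:, b].max()   # best response to b
            main_best = q_main[a, b] >= q_main[a, :].max()   # best response to a
            if ramp_best and main_best:
                return a, b
    return np.unravel_index((q_ramp + q_main).argmax(), q_ramp.shape)

def nash_double_q_update(s, a, b, r_ramp, r_main, s_next,
                         alpha=0.1, gamma=0.95):
    """One transition: pick one table of each double-Q pair to update, select
    the next-state Nash joint action with that table, and evaluate it with
    the other table (the bias-reduction idea behind double Q-learning)."""
    i = rng.integers(2)   # table used for equilibrium selection and update
    j = 1 - i             # table used only for evaluating the target
    a_n, b_n = pure_nash(Q["ramp"][i][s_next], Q["main"][i][s_next])
    for agent, r in (("ramp", r_ramp), ("main", r_main)):
        target = r + gamma * Q[agent][j][s_next, a_n, b_n]
        Q[agent][i][s, a, b] += alpha * (target - Q[agent][i][s, a, b])
```

The select-with-one-table, evaluate-with-the-other decoupling is presumably what "Nash double Q" combines with Nash Q-learning: the Nash joint action replaces the single-agent argmax in the double-Q target.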