Reward inference of discrete-time expert's controllers: A complementary learning approach

Date published

2023-03-06

Journal Title

Information Sciences

Publisher

Elsevier

Type

Article

ISSN

0020-0255

Citation

Perrusquia A, Guo W. (2023) Reward inference of discrete-time expert's controllers: a complementary learning approach. Information Sciences, 631, June 2023, pp. 396-411.

Abstract

Uncovering the reward function of an optimal controller is crucial to determining the performance that an expert wants to inject into a given dynamical system. In this paper, a reward inference algorithm for discrete-time expert controllers is proposed. The approach is inspired by the complementary mechanisms of the striatum, neocortex, and hippocampus for decision making and experience transfer. These systems work together to infer the reward function associated with the expert's controller, using the complementary merits of data-driven and online learning methods. The proposed approach models the neocortex as two independent learning algorithms: a Q-learning algorithm and a gradient identification rule. The hippocampus is modelled by a least-squares update rule that extracts the relation between the states and control inputs in the expert's data. The striatum is modelled by an inverse optimal control algorithm that iteratively recovers the hidden reward function. Lyapunov stability theory is used to establish the stability and convergence of the proposed approach. Simulation studies demonstrate the effectiveness of the proposed complementary learning algorithm.
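To make the pipeline in the abstract concrete, below is a minimal NumPy sketch of reward inference for the discrete-time linear-quadratic case. It is an illustration, not the paper's algorithm: the system matrices, noise level, and helper names (dare_by_iteration, lqr_gain) are assumptions, known dynamics (A, B) stand in for the paper's Q-learning and gradient identification components, and only the hippocampus-like least-squares fit of the expert gain and a striatum-like inverse optimal control step are sketched.

```python
# Illustrative sketch only (assumed system and names, not the paper's code).
import numpy as np

rng = np.random.default_rng(0)

def dare_by_iteration(A, B, Q, R, iters=500):
    """Solve the discrete algebraic Riccati equation by value iteration."""
    P = Q.copy()
    for _ in range(iters):
        P = A.T @ P @ A - A.T @ P @ B @ np.linalg.solve(
            R + B.T @ P @ B, B.T @ P @ A) + Q
    return P

def lqr_gain(A, B, Q, R):
    """Optimal state-feedback gain K for u_k = -K x_k."""
    P = dare_by_iteration(A, B, Q, R)
    return np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

# Expert with a hidden reward: x_{k+1} = A x_k + B u_k, u_k = -K* x_k.
A = np.array([[1.0, 0.1],
              [0.0, 1.0]])
B = np.array([[0.05, 0.0],
              [0.0, 0.1]])
Q_true, R_true = np.diag([2.0, 1.0]), 0.5 * np.eye(2)
K_star = lqr_gain(A, B, Q_true, R_true)

# Hippocampus-like step: least squares on expert (state, input) pairs.
X = rng.normal(size=(2, 200))                        # observed expert states
U = -K_star @ X + 1e-3 * rng.normal(size=(2, 200))   # noisy expert inputs
K_hat = -U @ X.T @ np.linalg.inv(X @ X.T)            # fit u = -K x

# Striatum-like step: inverse optimal control. Rewards are identifiable
# only up to scaling, so fix R = I and solve the linear optimality
# condition B^T P (A - B K) = R K for a symmetric P, then read Q off
# the discrete algebraic Riccati equation.
R = np.eye(2)
Ac = A - B @ K_hat
n = A.shape[0]
basis = []
for i in range(n):
    for j in range(i, n):
        E = np.zeros((n, n))
        E[i, j] = E[j, i] = 1.0
        basis.append(E)
M = np.column_stack([(B.T @ E @ Ac).ravel() for E in basis])
coeffs, *_ = np.linalg.lstsq(M, (R @ K_hat).ravel(), rcond=None)
P = sum(c * E for c, E in zip(coeffs, basis))
Q_inf = P - A.T @ P @ A + K_hat.T @ (R + B.T @ P @ B) @ K_hat

# Sanity check: the inferred reward should reproduce the expert's gain.
print("expert gain:\n", K_star)
print("gain from inferred reward:\n", lqr_gain(A, B, Q_inf, R))
```

Because the reward is only identified up to a positive scaling (here (Q_inf, R) comes out close to 2·(Q_true, R_true)), the sanity check compares the recovered gain rather than the weight matrices themselves.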

Keywords

Discrete-time linear systems, Hippocampus, Neocortex, Striatum, Complementary learning, Gradient update rule

Rights

Attribution 4.0 International (CC BY 4.0)
