Reward inference of discrete-time expert's controllers: A complementary learning approach

dc.contributor.author: Perrusquía, Adolfo
dc.contributor.author: Guo, Weisi
dc.date.accessioned: 2023-03-24T13:33:12Z
dc.date.available: 2023-03-24T13:33:12Z
dc.date.issued: 2023-03-06
dc.description.abstract: Uncovering the reward function of optimal controllers is crucial to determine the desired performance that an expert wants to inject to a certain dynamical system. In this paper, a reward inference algorithm of discrete-time expert's controllers is proposed. The approach is inspired by the complementary mechanisms of the striatum, neocortex, and hippocampus for decision making and experience transference. These systems work together to infer the reward function associated to expert's controller using the complementary merits of data-driven and online learning methods. The proposed approach models the neocortex system as two independent learning algorithms given by a Q-learning algorithm and a gradient identification rule. The hippocampus is modelled by a least-squares update rule that extracts the relation from the states and control inputs of the expert's data. The striatum is modelled by an inverse optimal control algorithm which iteratively finds the hidden reward function. Lyapunov stability theory is used to show the stability and convergence of the proposed approach. Simulation studies are given to demonstrate the effectiveness of the proposed complementary learning algorithm. [en_UK]
dc.identifier.citation: Perrusquia A, Guo W. (2023) Reward inference of discrete-time expert’s controllers: a complementary learning approach, Information Sciences. 631, June 2023, pp. 396-411 [en_UK]
dc.identifier.issn: 0020-0255
dc.identifier.uri: https://doi.org/10.1016/j.ins.2023.02.079
dc.identifier.uri: https://dspace.lib.cranfield.ac.uk/handle/1826/19350
dc.language.iso: en [en_UK]
dc.publisher: Elsevier [en_UK]
dc.rights: Attribution 4.0 International [*]
dc.rights.uri: http://creativecommons.org/licenses/by/4.0/ [*]
dc.subject: Discrete-time linear systems [en_UK]
dc.subject: Hippocampus [en_UK]
dc.subject: Neocortex [en_UK]
dc.subject: Striatum [en_UK]
dc.subject: Complementary learning [en_UK]
dc.subject: Gradient update rule [en_UK]
dc.title: Reward inference of discrete-time expert's controllers: A complementary learning approach [en_UK]
dc.type: Article [en_UK]

Files

Original bundle
Name: Reward_inference_of_discrete-time_experts_controllers-2023.pdf
Size: 1.22 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 1.63 KB
Format: Item-specific license agreed upon to submission