Safe online learning for nonlinear dynamical systems using control contraction metrics

Date published

2023-03

Free to read from

2024-09-05

Journal Title

Journal ISSN

Volume Title

Publisher

Cranfield University

Department

SATM

Type

Thesis

ISSN

Format

Citation

Abstract

This thesis aims to develop an online learning framework for a military fixed-wing aircraft that can adapt a control policy to unforeseen changes in the airframe’s flight dynamics. This is an active area of research and a significant challenge for high-dimensional nonlinear systems, owing to the inherent safety risks and the computational cost of exponential-time algorithms. The research addresses this aim by providing an extensive survey of safe online learning approaches for nonlinear systems, assessing each technique against key performance metrics. Two critical performance metrics are the reliability and the time complexity of the approach taken. A benchmarking study of salient techniques provides further evidence for the findings of the literature survey and helps to identify promising avenues of research. A generic safe learning process is defined, and a convex optimisation learning pipeline is developed to handle nonlinear system identification and online controller synthesis via control contraction metrics. The developed pipeline is applied to a longitudinal simulation of an F-16 aircraft using high-fidelity wind tunnel data. The gap in knowledge around the application of control contraction metrics to aircraft designed to meet flying qualities requirements based on linear time-invariant theory is bridged. A novel cascaded two-loop algorithm is developed to explicitly place the eigenvalues of the inner and outer loops of a differential feedback controller. Further, a parameterisation of a robust controller is shown to better optimise performance relative to flying qualities specifications. Conditions for a hybrid linear sum of controllers are shown to provide a stability guarantee for a mixed controller that enables a performance trade-off between the approaches.
The performance of the developed controllers is demonstrated on six damaged aircraft profiles to assess the robustness and transient characteristics of each method over a forty-second flight trajectory. The variation in nonlinear damage profiles illustrates the limitations of a linear approximating function for three nonlinear deviations. We show that the robust quadratic regulator controller generates a smoother transient response than the exponential contraction controllers. The two-loop contraction metric controller improves the rise-time performance compared to a single-loop controller but is less robust to damage variations. The outcome of the research is a greater understanding of the application of contraction-based controllers and of the effect of tuning parameters for a robust controller, with potential for use in a reinforcement learning algorithm. Further, a method to hybridise control policies is proposed, and a loop-shaping method using contraction-based linear matrix inequalities is developed, with potential application to cascaded systems.

Description

Software Description

Software Language

Github

Keywords

Safe Learning, Model Based Reinforcement Learning, Adaptive Control, Nonlinear Systems, Control Contraction Metrics, Fixed-Wing Control, Convex Optimisation.

DOI

Rights

© Cranfield University, 2023. All rights reserved. No part of this publication may be reproduced without the written permission of the copyright holder.

Relationships

Supplements

Funder/s

Engineering and Physical Sciences Research Council (EPSRC)