Open access
Author
Date
2021
Type
- Doctoral Thesis
ETH Bibliography
yes
Abstract
The increasing impact of data-driven technologies across various industries has sparked renewed interest in using learning-based approaches to automatically design and optimize control systems. While recent success stories from the field of reinforcement learning (RL) suggest immense potential for such approaches, the lack of safety certificates still confines learning-based methods to simulation environments or fail-safe laboratory conditions.

To address this, Part A of this dissertation introduces a predictive safety filter that enhances existing, potentially unsafe learning-based controllers with safety guarantees. The underlying method is based on model predictive control (MPC) theory and ensures constraint satisfaction through an optimization-based safety mechanism that provides a safe backup control law at all times. To enable the efficient design of the proposed predictive safety filter from system data, this thesis extends available robustification methods from MPC to support diverse system classes through different model assumptions. Specifically, this part of the thesis introduces the core concepts for closed-loop chance constraint satisfaction using simple linear system models with data-driven uncertainties and learning-based linear model estimates with unbounded process noise. Moreover, uncertain system models with significant nonlinear effects are supported efficiently through a prediction mechanism that exploits confident subsets of the state and input space. Further developments of these techniques are outlined in this thesis; they additionally cover distributed systems and illustrate the predictive safety filter in a miniature racing application. Compared with existing safety frameworks based on control barrier function theory, predictive safety filters avoid the computationally difficult task of deriving a control barrier function and thereby offer favorable scalability toward large-scale and distributed systems. Despite the seemingly different concepts of predictive safety filters and control barrier functions, this thesis establishes and formalizes the theoretical relations between the two approaches through a so-called ‘predictive control barrier function’, further enabling the recovery of infeasible nonlinear predictive control problems in an asymptotically stable fashion.
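To illustrate the mechanism described above, the following is a minimal sketch of a generic predictive safety filter problem; the notation ($f$, $\mathcal{X}$, $\mathcal{U}$, $\mathcal{S}_{\mathrm{f}}$, horizon $N$) is assumed here for illustration and is not taken verbatim from the thesis. Given a proposed learning-based input $u_{\mathrm{L}}(x(t))$, the filter searches for the input closest to it that still admits a safe backup trajectory:

\[
\begin{aligned}
\min_{u_{0|t},\dots,u_{N-1|t}} \quad & \bigl\| u_{0|t} - u_{\mathrm{L}}(x(t)) \bigr\|^{2} \\
\text{s.t.} \quad & x_{0|t} = x(t), \quad x_{k+1|t} = f(x_{k|t}, u_{k|t}), \\
& x_{k|t} \in \mathcal{X}, \quad u_{k|t} \in \mathcal{U}, \quad k = 0, \dots, N-1, \\
& x_{N|t} \in \mathcal{S}_{\mathrm{f}},
\end{aligned}
\]

where $\mathcal{S}_{\mathrm{f}}$ denotes a safe terminal set in which a known backup control law can keep the system. Only the first optimized input $u^{*}_{0|t}$ is applied: if the learning-based input already admits a safe backup plan, it passes through unchanged; otherwise it is modified as little as necessary.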
While predictive safety filters offer a high degree of modularity in terms of safety and task-specific objectives, this separation can render a rigorous performance analysis difficult. To this end, Part B introduces specialized learning-based MPC controllers for accelerated learning toward a distinct goal. Even if the objective function is explicitly available, the design of an MPC controller requires an accurate prediction model, often in combination with a terminal constraint and objective function to compensate for short prediction horizons. Part B tackles the difficult design task of these components from three different angles. It first introduces a learning-based improvement of established and safe MPC controllers for asymptotic stabilization tasks through a stochastic tube-based MPC mechanism that supports probabilistic regression models. While this makes it possible to take advantage of available system data for accurate predictions, insufficient prior knowledge or a deficient initial database requires additional mechanisms to efficiently acquire new data.

To automate this identification process, the contributions of Part B continue with the question of how a controller can efficiently explore the system and when to transition from exploration to exploitation of available information. The proposed solution to these questions is based on posterior sampling theory and results in a computationally efficient active learning MPC formulation that provides finite-time performance guarantees. The last contribution of this part addresses the performance degradation of an MPC controller caused by short prediction horizons, which is present even in the case of perfectly known prediction models. To overcome this limitation, Part B develops a data-driven mechanism that iteratively improves the terminal cost and terminal set of an MPC problem by leveraging system trajectories. During training, the proposed method efficiently handles model uncertainties and constraint violations to support learning-based prediction models and poorly performing initial controllers. This is achieved through a soft-constrained MPC formulation supporting polytopic state constraints.
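As a rough illustration of the soft-constrained formulation mentioned above (again with generic notation assumed for this sketch, not taken from the thesis), polytopic state constraints $H x \le h$ are relaxed with penalized slack variables so that model errors or a poorly performing initial controller cannot render the problem infeasible during learning:

\[
\begin{aligned}
\min_{u_{0|t},\dots,u_{N-1|t},\; \xi_{0|t},\dots,\xi_{N-1|t}} \quad & \sum_{k=0}^{N-1} \Bigl( \ell(x_{k|t}, u_{k|t}) + \rho \, \| \xi_{k|t} \|_{1} \Bigr) + V_{\mathrm{f}}(x_{N|t}) \\
\text{s.t.} \quad & x_{0|t} = x(t), \quad x_{k+1|t} = f(x_{k|t}, u_{k|t}), \\
& H x_{k|t} \le h + \xi_{k|t}, \quad \xi_{k|t} \ge 0, \quad u_{k|t} \in \mathcal{U}, \quad k = 0, \dots, N-1, \\
& x_{N|t} \in \mathcal{X}_{\mathrm{f}},
\end{aligned}
\]

where $\rho > 0$ weights the constraint-violation penalty, and the terminal cost $V_{\mathrm{f}}$ and terminal set $\mathcal{X}_{\mathrm{f}}$ are the components that would be improved iteratively from recorded closed-loop trajectories.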
Permanent link
https://doi.org/10.3929/ethz-b-000534919
Publication status
published
External links
Search print copy at ETH Library
Contributors
Examiner: Zeilinger, Melanie N.
Examiner: Borrelli, Francesco
Examiner: Trimpe, Sebastian
Publisher
ETH Zurich
Subject
model predictive control (MPC); stochastic control; Safe learning-based control; Constrained control
Organisational unit
09563 - Zeilinger, Melanie / Zeilinger, Melanie