Robust Model Predictive Shielding for Safe Reinforcement Learning with Stochastic Dynamics

Shuo Li, Osbert Bastani

We propose a framework for safe reinforcement learning that can handle stochastic nonlinear dynamical systems. We focus on the setting where the nominal dynamics are known and are subject to additive stochastic disturbances with a known distribution. Our goal is to ensure the safety of a control policy trained using reinforcement learning, e.g., in a simulated environment. We build on the idea of m...