Safe Continuous Control with Constrained Model-Based Policy Optimization

Moritz A. Zanger, Karam Daaboul, J. Marius Zöllner

The applicability of reinforcement learning (RL) algorithms in real-world domains often requires adherence to safety constraints, a need that is difficult to address given the asymptotic nature of the classic RL optimization objective. In contrast to the traditional RL objective, safe exploration considers the maximization of expected returns under safety constraints expressed in terms of expected cost returns. We...