Safe Continuous Control with Constrained Model-Based Policy Optimization
Moritz A. Zanger, Karam Daaboul, J. Marius Zöllner
The applicability of reinforcement learning (RL) algorithms in real-world domains often requires adherence to safety constraints, a need difficult to address given the asymptotic nature of the classic RL optimization objective. In contrast to the traditional RL objective, safe exploration considers the maximization of expected returns under safety constraints expressed in expected cost returns. We...
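For orientation, the constrained objective described above is commonly written in the following form. This is a sketch of the standard constrained-MDP formulation, not notation taken from the paper: the symbols $\pi$ (policy), $r$ (reward), $c$ (cost), $\gamma$ (discount factor), and $d$ (cost threshold) are assumptions introduced here for illustration.

```latex
\begin{aligned}
\max_{\pi} \quad & J_R(\pi) = \mathbb{E}_{\tau \sim \pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}\, r(s_t, a_t) \right] \\
\text{s.t.} \quad & J_C(\pi) = \mathbb{E}_{\tau \sim \pi}\!\left[ \sum_{t=0}^{\infty} \gamma^{t}\, c(s_t, a_t) \right] \le d
\end{aligned}
```

Here $J_R$ is the expected return and $J_C$ the expected cost return. The classic RL objective corresponds to dropping the constraint on $J_C$.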