Dual Variable Actor-Critic for Adaptive Safe Reinforcement Learning

Junseo Lee, Jaeseok Heo, Dohyeong Kim, Gunmin Lee, Songhwai Oh

Satisfying safety constraints in reinforcement learning (RL) is an important issue, especially in real-world applications. Many studies have approached safe RL with the Lagrangian method, which introduces dual variables. However, applying a trained policy with the optimal dual variable to a new environment can be hazardous since the optimal value of the dual variable, which represents a level of s...