SafeTAC: Safe Tsallis Actor-Critic Reinforcement Learning for Safer Exploration

robot, IROS 2022

Dohyeong Kim, Jaeseok Heo, Songhwai Oh

Satisfying safety constraints is the top priority in safe reinforcement learning (RL). However, without proper exploration, the agent can learn an overly conservative policy, such as freezing in place. To address this, we use maximum entropy RL methods for exploration. In particular, an RL method with Tsallis entropy maximization, called Tsallis actor-critic (TAC), is used to synthesize po...

