Improving Safety in Deep Reinforcement Learning using Unsupervised Action Planning
Hao-Lun Hsu, Qiuhua Huang, Sehoon Ha
One of the key challenges in deep reinforcement learning (deep RL) is ensuring safety during both the training and testing phases. In this work, we propose a novel technique, unsupervised action planning, to improve the safety of on-policy reinforcement learning algorithms such as trust region policy optimization (TRPO) or proximal policy optimization (PPO). We design our safety-aware reinforcement lea...