L2C2: Locally Lipschitz Continuous Constraint towards Stable and Smooth Reinforcement Learning
Taisuke Kobayashi,Taisuke Kobayashi
This paper proposes a new regularization technique for reinforcement learning (RL) towards making policy and value functions smooth and stable. RL is known for the instability of the learning process and the sensitivity of the acquired policy to noise. Several methods have been proposed to resolve these problems, and in summary, the smoothness of policy and value functions learned mainly in RL con...