Regularizing Action Policies for Smooth Control with Reinforcement Learning
Siddharth Mysore, Bassel Mabsout, Renato Mancuso, Kate Saenko
A critical problem with the practical utility of controllers trained with deep Reinforcement Learning (RL) is the notable lack of smoothness in the actions learned by the RL policies. This trend often presents itself in the form of control signal oscillation and can result in poor control, high power consumption, and undue system wear. We introduce Conditioning for Action Policy Smoothness (CAPS),...