Learning Stabilization Control from Observations by Learning Lyapunov-like Proxy Models

Milan Ganai, Chiaki Hirayama, Ya-Chien Chang, Sicun Gao

The deployment of Reinforcement Learning in robotics applications is hindered by the difficulty of reward engineering. Therefore, approaches have focused on creating reward functions through Learning from Observations (LfO), the task of learning policies from expert trajectories that contain only state sequences. We propose new methods for LfO for the important class of continuous control problems of le...