Dynamic Mirror Descent based Model Predictive Control for Accelerating Robot Learning
Utkarsh A. Mishra, Soumya R. Samineni, Prakhar Goel, Chandravaran Kunjeti, Himanshu Lodha, Aman Singh, Aditya Sagi, Shalabh Bhatnagar, Shishir Kolathaya
Recent works in Reinforcement Learning (RL) combine model-free (Mf)-RL algorithms with model-based (Mb)-RL approaches to get the best of both: the asymptotic performance of Mf-RL and the high sample efficiency of Mb-RL. Inspired by these works, we propose a hierarchical framework that integrates online learning for Mb trajectory optimization with off-policy methods for Mf-RL. In particular, two ...