IQL-TD-MPC: Implicit Q-Learning for Hierarchical Model Predictive Control

Rohan Chitnis, Yingchen Xu, Bobak Hashemi, Lucas Lehnert, Urun Dogan, Zheqing Zhu, Olivier Delalleau

Model-based reinforcement learning (RL) has shown great promise due to its sample efficiency, but still struggles with long-horizon sparse-reward tasks, especially in offline settings where the agent learns from a fixed dataset. We hypothesize that model-based RL agents struggle in these environments due to a lack of long-term planning capabilities, and that planning in a temporally abstract model...