Efficient Preference-Based Reinforcement Learning Using Learned Dynamics Models

Yi Liu, Gaurav Datta, Ellen Novoseller, Daniel S. Brown

Preference-based reinforcement learning (PbRL) can enable robots to learn to perform tasks based on an individual's preferences without requiring a hand-crafted reward function. However, existing approaches either assume access to a high-fidelity simulator or analytic model, or take a model-free approach that requires extensive, possibly unsafe online environment interactions. In this paper, we st...