Shaping Rewards for Reinforcement Learning with Imperfect Demonstrations using Generative Models

robot, ICRA 2021

Yuchen Wu, Melissa Mozifian, Florian Shkurti

The potential benefits of model-free reinforcement learning for real robotic systems are limited by its uninformed exploration, which leads to slow convergence, lack of data efficiency, and unnecessary interactions with the environment. To address these drawbacks, we propose a method that combines reinforcement and imitation learning by shaping the reward function with a state-and-action-dependent po...

