Learning Reward Functions for Robotic Manipulation by Observing Humans

Minttu Alakuijala, Gabriel Dulac-Arnold, Julien Mairal, Jean Ponce, Cordelia Schmid

Observing a human demonstrator manipulate objects provides a rich, scalable and inexpensive source of data for learning robotic policies. However, transferring skills from human videos to a robotic manipulator poses several challenges, not least a difference in action and observation spaces. In this work, we use unlabeled videos of humans solving a wide range of manipulation tasks to learn a task-...