Rank2Reward: Learning Shaped Reward Functions from Passive Video
Daniel Yang, Davin Tjia, Jacob Berg, Dima Damen, Pulkit Agrawal, Abhishek Gupta
Teaching robots novel skills with demonstrations via human-in-the-loop data collection techniques like kinesthetic teaching or teleoperation puts a heavy burden on human supervisors. In contrast to this paradigm, it is often significantly easier to provide raw, action-free visual data of tasks being performed. Moreover, this data can even be mined from video datasets or the web. Ideally, this data...