Learning on the Job: Self-Rewarding Offline-to-Online Finetuning for Industrial Insertion of Novel Connectors from Vision

robot, ICRA 2023

Ashvin Nair, Brian Zhu, Gokul Narayanan, Eugen Solowjow, Sergey Levine

Learning-based methods in robotics hold the promise of generalization, but what can be done if a learned policy does not generalize to a new situation? In principle, if an agent can at least evaluate its own success (i.e., with a reward classifier that generalizes well even when the policy does not), it could actively practice the task and finetune the policy in this situation. We study this probl...

