Value-Informed Skill Chaining for Policy Learning of Long-Horizon Tasks with Surgical Robot

Tao Huang, Kai Chen, Wang Wei, Jianan Li, Yonghao Long, Qi Dou

Reinforcement learning still struggles to solve long-horizon surgical robot tasks, which involve multiple steps over an extended duration, because of the policy exploration challenge. Recent methods tackle this problem by skill chaining, in which the long-horizon task is decomposed into multiple subtasks to ease the exploration burden, and subtask policies are temporally connect...