No-Regret Shannon Entropy Regularized Neural Contextual Bandit Online Learning for Robotic Grasping

robot, IROS 2020

Kyungjae Lee, Jaegu Choy, Yunho Choi, Hogun Kee, Songhwai Oh

In this paper, we propose a novel contextual bandit algorithm, Shannon entropy regularized neural contextual bandits (SERN), which employs a neural network as a reward estimator and uses Shannon entropy regularization to encourage exploration. In many learning-based algorithms for robotic grasping, the lack of real-world data hampers the generalization performance of a model ...
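
To make the role of Shannon entropy regularization concrete, the sketch below shows the standard closed-form result that maximizing expected estimated reward plus `alpha` times the Shannon entropy of the action distribution yields a softmax over the estimates. It is an illustration under stated assumptions, not SERN itself: the neural reward estimator is replaced by a fixed vector of hypothetical reward estimates, and the function names and the temperature `alpha` are introduced here for illustration only.

```python
import numpy as np


def entropy_regularized_policy(estimated_rewards, alpha=0.1):
    """Distribution p maximizing <p, r> + alpha * H(p), i.e. softmax(r / alpha).

    `estimated_rewards` stands in for the output of a reward estimator
    (hypothetical values here); `alpha` is an assumed temperature.
    """
    r = np.asarray(estimated_rewards, dtype=float)
    z = (r - r.max()) / alpha  # subtract the max for numerical stability
    p = np.exp(z)
    return p / p.sum()


def sample_action(estimated_rewards, alpha=0.1, rng=None):
    """Sample an action index from the entropy-regularized policy."""
    rng = np.random.default_rng() if rng is None else rng
    p = entropy_regularized_policy(estimated_rewards, alpha)
    return int(rng.choice(len(p), p=p)), p


if __name__ == "__main__":
    rewards = [0.2, 0.5, 0.9]  # hypothetical estimates for three candidate grasps
    action, probs = sample_action(rewards, alpha=0.1, rng=np.random.default_rng(0))
    print(action, probs)
```

A larger `alpha` flattens the distribution and so explores more, while a smaller `alpha` concentrates probability on the action with the highest estimated reward.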
