POLITE: Preferences Combined with Highlights in Reinforcement Learning

robot, ICRA 2024

Simon Holk, Daniel Marta, Iolanda Leite

Many solutions have been devised to address the challenge of robot learning, namely by exploring novel ways for humans to communicate complex goals and tasks in reinforcement learning (RL) setups. One approach that has attracted recent research interest addresses the problem directly by treating human feedback as preferences between pairs of trajectories (sequences of state-action pairs). However...
