A Generalized Acquisition Function for Preference-based Reward Learning

Evan Ellis, Gaurav R. Ghosal, Stuart J. Russell, Anca Dragan, Erdem Bıyık

Preference-based reward learning is a popular technique for teaching robots and autonomous systems how a human user wants them to perform a task. Previous works have shown that actively synthesizing preference queries to maximize information gain about the reward function parameters improves data efficiency. The information gain criterion focuses on precisely identifying all parameters of the rewa...