A Marginal Log-Likelihood Approach for the Estimation of Discount Factors of Multiple Experts in Inverse Reinforcement Learning
Babatunde H. Giwa, Chi-Guhn Lee
We focus on multiple experts performing a task in a Markov decision process (MDP) environment. A probabilistic assignment of trajectories to clusters and a mathematical framework which leverages the utility function are employed to jointly estimate the discount factor and reward. We treat the number of clusters as a hyperparameter which can be "freely" selected by the problem designer. In this wor...