Bayesian Residual Policy Optimization: Scalable Bayesian Reinforcement Learning with Clairvoyant Experts

Gilwoo Lee, Brian Hou, Sanjiban Choudhury, Siddhartha S. Srinivasa

Informed and robust decision making in the face of uncertainty is critical for robots operating in unstructured environments. We formulate this as Bayesian Reinforcement Learning over latent Markov Decision Processes (MDPs). While Bayes-optimality is theoretically the gold standard, existing algorithms scale poorly to continuous state and action spaces. We build on the following insight: in the ab...