Safe POMDP Online Planning via Shielding

Shili Sheng, David Parker, Lu Feng

Partially observable Markov decision processes (POMDPs) are widely used in robotic applications for sequential decision-making under uncertainty. Online POMDP planning algorithms such as Partially Observable Monte-Carlo Planning (POMCP) can solve very large POMDPs with the goal of maximizing the expected return. However, the resulting policies cannot provide safety guarantees, which are imper...