Scalable POMDP Decision-Making Using Circulant Controllers
Kyle Hollins Wray, Kenneth Czuprynski
This paper presents a novel policy representation for partially observable Markov decision processes (POMDPs) called circulant controllers and a provably efficient gradient-based algorithm for them. A formal mathematical description is provided that leverages circulant matrices for the controller’s stochastic node transitions. This structure is particularly effective for capturing decision-making ...
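As a rough illustration of the circulant structure mentioned above, the sketch below builds a stochastic node-transition matrix in which every row is a cyclic shift of one probability vector. It is not the paper's formulation: the function name `circulant`, the three-node controller, and the probabilities in `p` are assumptions chosen for illustration, and the excerpt does not say how the transitions depend on actions or observations.

```python
def circulant(first_row):
    """Return the circulant matrix whose i-th row is first_row rotated right by i."""
    n = len(first_row)
    return [[first_row[(j - i) % n] for j in range(n)] for i in range(n)]


# Hypothetical distribution over relative moves between controller nodes:
# stay on the current node, advance by one node, advance by two nodes.
p = [0.5, 0.25, 0.25]

# T[i][j] is the probability of moving from node i to node j.
T = circulant(p)

for row in T:
    print(row)
```

Because each row is a rotation of `p`, the matrix is described by n values rather than n squared, and each row remains a valid probability distribution whenever `p` is one.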