Learning Skills to Navigate without a Master: A Sequential Multi-Policy Reinforcement Learning Algorithm

Ambedkar Dukkipati, Rajarshi Banerjee, Ranga Shaarad Ayyagari, Dhaval Parmar Udaybhai

Solving complex problems with reinforcement learning requires breaking the problem down into manageable tasks and learning policies to solve them. These policies, in turn, must be controlled by a master policy that makes high-level decisions. Learning such policies therefore involves hierarchical decision structures. However, training such methods in practice may lead to poor generalization...