Scalable Model-based Policy Optimization for Decentralized Networked Systems

Yali Du, Chengdong Ma, Yuchen Liu, Runji Lin, Hao Dong, Jun Wang, Yaodong Yang

Reinforcement learning algorithms require a large number of samples, which often limits their real-world application even on simple tasks. This challenge is more pronounced in multi-agent tasks, where each step of operation is more costly because it requires communication or the shifting of resources. This work aims to improve the data efficiency of multi-agent control through model-based learning. We consider network...