DisCo RL: Distribution-Conditioned Reinforcement Learning for General-Purpose Policies

Soroush Nasiriany, Vitchyr H. Pong, Ashvin Nair, Alexander Khazatsky, Glen Berseth, Sergey Levine

Can we use reinforcement learning to learn general-purpose policies that can perform a wide range of different tasks, resulting in flexible and reusable skills? Contextual policies provide this capability in principle, but the representation of the context determines the degree of generalization and expressivity. Categorical contexts preclude generalization to entirely new tasks. Goal-conditioned ...