Value learning from trajectory optimization and Sobolev descent: A step toward reinforcement learning with superlinear convergence properties

robot, ICRA 2022

Amit Parag, Sébastien Kleff, Léo Saci, Nicolas Mansard, Olivier Stasse

The recent successes in deep reinforcement learning largely rely on the ability to generate massive amounts of data, which in turn implies the use of a simulator. In particular, current progress in multi-body dynamic simulators is underpinning the implementation of reinforcement learning for end-to-end control of robotic systems. Yet simulators are mostly treated as black boxes while we have th...
