Q-learning with Long-term Action-space Shaping to Model Complex Behavior for Autonomous Lane Changes
Gabriel Kalweit, Maria Huegle, Moritz Werling, Joschka Boedecker
In autonomous driving applications, reinforcement learning agents often have to perform complex behavior, which can translate into optimizing multiple objectives while following certain rules. Encoding traffic rules and desires such as safety and comfort via classical methods based on reward shaping (i.e. a weighted combination of different objectives in the reward signal) or Lagrangian methods (i...