Memory-based Deep Reinforcement Learning for POMDPs
Lingheng Meng, Rob Gorbet, Dana Kulić
A promising characteristic of Deep Reinforcement Learning (DRL) is its capability to learn an optimal policy in an end-to-end manner without relying on feature engineering. However, most approaches assume a fully observable state space, i.e., fully observable Markov Decision Processes (MDPs). In real-world robotics, this assumption is impractical because of issues such as sensor sensitivity limitatio...