Reinforcement Learning Based Temporal Logic Control with Maximum Probabilistic Satisfaction
Mingyu Cai, Shaoping Xiao, Baoluo Li, Zhiliang Li, Zhen Kan
This paper presents a model-free reinforcement learning (RL) algorithm that synthesizes a control policy maximizing the satisfaction probability of complex tasks expressed as linear temporal logic (LTL) specifications. To account for environment and motion uncertainties, we model the robot motion as a probabilistic labeled Markov decision process (PL-MDP) with unknown tran...