Probabilistically Guaranteed Satisfaction of Temporal Logic Constraints During Reinforcement Learning
Derya Aksaray, Yasin Yazıcıoğlu, Ahmet Semi Asarkaya
We propose a novel constrained reinforcement learning method for finding optimal policies in Markov decision processes while satisfying temporal logic constraints with a desired probability throughout the learning process. An automata-theoretic approach ensures the probabilistic satisfaction of the constraint in each episode, in contrast to penalizing violations to achieve c...