Follow The Rules: Online Signal Temporal Logic Tree Search for Guided Imitation Learning in Stochastic Domains

Jasmine Jerry Aloor, Jay Patrikar, Parv Kapoor, Jean Oh, Sebastian Scherer

Seamlessly integrating rules into Learning-from-Demonstrations (LfD) policies is a critical requirement for the real-world deployment of AI agents. Recently, Signal Temporal Logic (STL) has been shown to be an effective language for encoding rules as spatio-temporal constraints. This work uses Monte Carlo Tree Search (MCTS) to integrate STL specifications into a vanilla LfD policy ...