Learning to Plan with Logical Automata

Brandon Araki, Kiran Vodrahalli, Thomas Leech, Cristian Ioan Vasile, Mark Donahue, and Daniela Rus. Learning to Plan with Logical Automata. In Robotics: Science and Systems Conference (RSS), pages 1–9, Messe Freiburg, Germany, June 2019.

Published date: Saturday, June 22, 2019

This paper introduces the Logic-based Value Iteration Network (LVIN) framework, which combines imitation learning and logical automata to enable agents to learn complex behaviours from demonstrations. We address two problems with learning from expert knowledge: (1) how to generalize learned policies for a task to larger classes of tasks, and (2) how to account for erroneous demonstrations. Our LVIN model solves finite gridworld environments by instantiating a recurrent, convolutional neural network as a value iteration procedure over a learned Markov Decision Process (MDP) that factors into two MDPs: a small finite state automaton (FSA) corresponding to logical rules, and a larger MDP corresponding to motions in the environment. The parameters of LVIN (value function, reward map, FSA transitions, large MDP transitions) are approximately learned from expert trajectories.
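
To make the factored structure concrete, the following is a minimal sketch of tabular value iteration over the product of a small FSA and a gridworld MDP. It is not the LVIN model: nothing here is learned, and the recurrent convolutional network is replaced by explicit loops. The 1-D gridworld, the "visit a, then b" task, the reward of 1 on reaching the goal FSA state, the discount factor, and all function and variable names are assumptions made up for illustration.

```python
import numpy as np

# Hypothetical 1-D gridworld with 5 cells (an assumption, not from the paper).
# Cell 1 carries proposition "a", cell 4 carries proposition "b".
N_CELLS = 5
LABELS = {1: "a", 4: "b"}
ACTIONS = (-1, +1)  # move left / move right
GAMMA = 0.9         # assumed discount factor

# Hypothetical FSA for the rule "visit a, then b":
# state 0 = waiting for "a", state 1 = waiting for "b", state 2 = goal.
N_FSA = 3
GOAL = 2


def fsa_step(q, label):
    """Small MDP: advance the FSA on the proposition seen in the new cell."""
    if q == 0 and label == "a":
        return 1
    if q == 1 and label == "b":
        return 2
    return q


def env_step(x, a):
    """Large MDP: deterministic motion, clipped at the grid boundaries."""
    return min(max(x + a, 0), N_CELLS - 1)


def backup(V, q, x, a):
    """One-step lookahead value of action a in product state (q, x)."""
    x2 = env_step(x, a)
    q2 = fsa_step(q, LABELS.get(x2))
    reward = 1.0 if q2 == GOAL else 0.0  # assumed reward: 1 on reaching the goal
    return reward + GAMMA * V[q2, x2]


def value_iteration(n_iters=50):
    """Value iteration over the product state space (FSA state, grid cell)."""
    V = np.zeros((N_FSA, N_CELLS))
    for _ in range(n_iters):
        V_new = np.zeros_like(V)
        for q in range(N_FSA - 1):  # the goal state is absorbing, value 0
            for x in range(N_CELLS):
                V_new[q, x] = max(backup(V, q, x, a) for a in ACTIONS)
        V = V_new
    return V


def greedy_rollout(V, x=0, q=0, max_steps=20):
    """Follow the greedy policy implied by V until the FSA reaches its goal."""
    path = [x]
    for _ in range(max_steps):
        if q == GOAL:
            break
        a = max(ACTIONS, key=lambda act: backup(V, q, x, act))
        x = env_step(x, a)
        q = fsa_step(q, LABELS.get(x))
        path.append(x)
    return path, q


V = value_iteration()
path, q_final = greedy_rollout(V)
print(path, q_final)
```

In this sketch the value table has one row per FSA state, so the same cell can have different values depending on how much of the logical rule has already been satisfied. In LVIN, by contrast, the reward map and both sets of transitions are approximately learned from expert trajectories rather than written by hand.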