Mapping Instructions and Visual Observations to Actions with Reinforcement Learning
Authors
Abstract
We propose to directly map raw visual observations and text input to actions for instruction execution. While existing approaches assume access to structured environment representations or use a pipeline of separately trained models, we learn a single model to jointly reason about linguistic and visual input. We use reinforcement learning in a contextual bandit setting to train a neural network agent. To guide the agent’s exploration, we use reward shaping with different forms of supervision. Our approach does not require intermediate representations, planning procedures, or training different models. We evaluate in a simulated environment, and show significant improvements over supervised learning and common reinforcement learning variants.
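The training idea in the abstract (a contextual bandit update driven by a shaped reward) can be illustrated with a toy sketch. This is not the authors' model, which is a neural network over raw images and text; everything here is an assumption made for illustration: a one-dimensional line world, the two-element `features` context, a linear softmax policy, and a shaping term equal to the reduction in distance to the goal.

```python
import math
import random

ACTIONS = (-1, +1)  # hypothetical action set: move left / move right


def features(pos, goal):
    """Hypothetical context: is the goal to the left or to the right?"""
    return [1.0 if goal < pos else 0.0, 1.0 if goal > pos else 0.0]


def policy(theta, x):
    """Softmax over linear scores, one weight row per action."""
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in theta]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def shaped_reward(pos, new_pos, goal):
    """Task reward plus an assumed shaping term (distance reduction)."""
    task = 1.0 if new_pos == goal else 0.0
    shaping = abs(goal - pos) - abs(goal - new_pos)
    return task + shaping


def train(steps=2000, size=10, lr=0.1, seed=0):
    rng = random.Random(seed)
    theta = [[0.0, 0.0] for _ in ACTIONS]
    for _ in range(steps):
        # Sample a context: a start position and a different goal.
        pos, goal = rng.sample(range(size), 2)
        x = features(pos, goal)
        probs = policy(theta, x)
        a = rng.choices(range(len(ACTIONS)), weights=probs)[0]
        new_pos = min(size - 1, max(0, pos + ACTIONS[a]))
        r = shaped_reward(pos, new_pos, goal)
        # Bandit-style update: immediate reward times grad of log pi(a | x).
        for k in range(len(ACTIONS)):
            grad = (1.0 if k == a else 0.0) - probs[k]
            for j, xj in enumerate(x):
                theta[k][j] += lr * r * grad * xj
    return theta
```

Calling `theta = train()` and then `policy(theta, features(2, 7))` should put most of the probability on the move-right action. The update uses only the immediate shaped reward for each sampled context, which is what makes it a bandit-style update rather than one based on a long-horizon return.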
Similar resources
Reinforcement Learning for Mapping Instructions to Actions
In this paper, we present a reinforcement learning approach for mapping natural language instructions to sequences of executable actions. We assume access to a reward function that defines the quality of the executed actions. During training, the learner repeatedly constructs action sequences for a set of documents, executes those actions, and observes the resulting reward. We use a policy grad...
Yoav Artzi Cornell University “Mapping Instructions to Actions”
An agent following instructions requires a robust understanding of language and its environment. In this talk I will describe two approaches to address the problem of mapping instructions to actions. First, I will describe a semantic parsing approach, where language is mapped to an intermediate formal representation. This method is interpretable and allows us to explicitly model context-dependent ...
Inverse Reinforcement Learning for Following Instructions
In order to achieve higher levels of autonomy, robots need the ability to interact naturally with humans in unstructured environments. One of the most intuitive and flexible interaction modalities is to allow a human teammate to instruct a robot with natural language commands. In order to follow natural language directions, a robot needs to convert symbolic natural language instruc...
CS229 Final Project: Language Grounding in Minecraft with Gated-Attention Networks
A key question in language understanding is the problem of language grounding: how do symbols such as words get their meaning? We examine this question in the context of task-oriented language grounding in gameplay. In order to perform tasks and challenges specified by natural language instructions, agents need to extract semantically meaningful representations of language and map them to the vi...
Reinforcement Learning for Adaptive Routing
Reinforcement learning means learning a policy (a mapping of observations into actions) based on feedback from the environment. The learning can be viewed as browsing a set of policies while evaluating them by trial through interaction with the environment. We present an application of a gradient ascent algorithm for reinforcement learning to a complex domain of packet routing in network communica...