Learning to Play the Worker-Placement Game Euphoria using Neural Fitted Q Iteration
Abstract
We design and implement an agent for the popular worker-placement and resource-management game Euphoria using Neural Fitted Q Iteration (NFQ), a reinforcement learning algorithm that represents the action-value function with an artificial neural network. Unlike typical Q-learning, which updates the function online after each step, NFQ updates the network offline from a stored sequence of training experiences. We find that the agent improves its performance against a random agent after only a relatively small number of games.
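The offline update that distinguishes NFQ from online Q-learning can be illustrated with a minimal sketch. This is not the Euphoria agent described in the abstract: the four-state chain environment, the one-hot encoding, the network size, and all function names are assumptions made for illustration, and plain batch gradient descent stands in for the Rprop optimizer used in the original NFQ formulation.

```python
import numpy as np

def init_net(n_in, n_hidden, rng):
    # One hidden tanh layer and a single linear output: Q(s, a).
    return {"W1": rng.normal(0, 0.5, (n_in, n_hidden)), "b1": np.zeros(n_hidden),
            "W2": rng.normal(0, 0.5, (n_hidden, 1)), "b2": np.zeros(1)}

def predict(net, x):
    h = np.tanh(x @ net["W1"] + net["b1"])
    return (h @ net["W2"] + net["b2"]).ravel()

def fit(net, x, y, epochs=300, lr=0.05):
    # Supervised regression on the whole pattern set (batch gradient descent).
    n = len(x)
    for _ in range(epochs):
        h = np.tanh(x @ net["W1"] + net["b1"])
        g_out = ((h @ net["W2"] + net["b2"]).ravel() - y)[:, None] / n
        g_h = (g_out @ net["W2"].T) * (1.0 - h ** 2)
        net["W2"] -= lr * (h.T @ g_out)
        net["b2"] -= lr * g_out.sum(axis=0)
        net["W1"] -= lr * (x.T @ g_h)
        net["b1"] -= lr * g_h.sum(axis=0)

def encode(s, a, n_states, n_actions):
    # One-hot state concatenated with one-hot action (an assumed encoding).
    x = np.zeros(n_states + n_actions)
    x[s] = 1.0
    x[n_states + a] = 1.0
    return x

def nfq(transitions, n_states, n_actions, gamma=0.9, iterations=50, seed=0):
    rng = np.random.default_rng(seed)
    net = init_net(n_states + n_actions, 16, rng)
    inputs = np.array([encode(s, a, n_states, n_actions)
                       for s, a, _, _, _ in transitions])
    for _ in range(iterations):
        # Build targets for every stored transition from the current network,
        # then refit the network on the full batch (the offline step).
        targets = []
        for s, a, r, s_next, done in transitions:
            if done:
                targets.append(r)
            else:
                nxt = np.array([encode(s_next, b, n_states, n_actions)
                                for b in range(n_actions)])
                targets.append(r + gamma * predict(net, nxt).max())
        fit(net, inputs, np.array(targets))
    return net

def greedy_action(net, s, n_states, n_actions):
    q = predict(net, np.array([encode(s, a, n_states, n_actions)
                               for a in range(n_actions)]))
    return int(np.argmax(q))

# Hypothetical chain: states 0..3, action 0 moves left, action 1 moves right,
# reaching state 3 yields reward 1 and ends the episode.
transitions = []
for s in range(3):
    for a in (0, 1):
        s_next = max(s - 1, 0) if a == 0 else s + 1
        done = s_next == 3
        transitions.append((s, a, 1.0 if done else 0.0, s_next, done))

net = nfq(transitions, n_states=4, n_actions=2)
print([greedy_action(net, s, 4, 2) for s in range(3)])
```

The key design point is that the experience set is fixed before training starts: each outer iteration recomputes the targets for all stored transitions and then runs an ordinary supervised fit, rather than adjusting the network after every individual step.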
Similar resources
Deep Reinforcement Learning with Regularized Convolutional Neural Fitted Q Iteration
We review the deep reinforcement learning setting, in which an agent receiving high-dimensional input from an environment learns a control policy without supervision using multilayer neural networks. We then extend the Neural Fitted Q Iteration value-based reinforcement learning algorithm (Riedmiller et al.) by introducing a novel variation which we call Regularized Convolutional Neural Fitted Q...
An Empirical Comparison of Neural Architectures for Reinforcement Learning in Partially Observable Environments
This paper explores the performance of fitted neural Q iteration for reinforcement learning in several partially observable environments, using three recurrent neural network architectures: Long Short-Term Memory [7], Gated Recurrent Unit [3] and MUT1, a recurrent neural architecture evolved from a pool of several thousand candidate architectures [8]. A variant of fitted Q iteration, based on A...
CS229 Final Report: Deep Q-Learning to Play Mario
In this paper, I study applying and adjusting DeepMind’s Atari Deep Q-Learning model to train an automatic agent to play the 1985 Nintendo game Super Mario Bros. The agent learns control policies from raw pixel data using deep reinforcement learning. The model is a convolutional neural network that is trained only on raw frames of the game and basic information such as score and motion.
Game-based Teaching of Stress Placement on Multi-syllabic English Words
Accurate pronunciation is an important component of language ability and the main outward linguistic sign of whether someone is a native speaker of a language or not. An area of particular difficulty for Persian-speaking learners of English, which may cause 'foreign accent' or misunderstanding in speaking, is placement of stress on multi-syllable words. Game-based pronunciation teaching can be ...
Q-Batch: initial results with a novel update rule for Batch Reinforcement Learning
Batch Reinforcement Learning has established itself as a valuable alternative for developing learning and adaptive agents. Batch Reinforcement Learning algorithms are characterized by obtaining a policy from a set of collected data. Common methods apply adapted versions of RL update rules, such as Q-learning, to the transitions of the batch, building a pattern set. The target values of the pattern r...