Learning Tetris Using the Noisy Cross-Entropy Method

Authors

  • István Szita
  • András Lörincz
Abstract

The cross-entropy method is an efficient and general optimization algorithm. However, its applicability in reinforcement learning (RL) seems to be limited because it often converges to suboptimal policies. We apply noise for preventing early convergence of the cross-entropy method, using Tetris, a computer game, for demonstration. The resulting policy outperforms previous RL algorithms by almost two orders of magnitude.
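The idea of adding noise to keep the cross-entropy method from converging too early can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract gives no algorithmic details, so the Gaussian sampling distribution, the elite fraction, the constant `noise` term added to the variance, and the two-dimensional `toy_score` function (a stand-in for an average game score) are all assumptions made here for the example.

```python
import random


def noisy_cross_entropy(score, dim, iterations=50, population=100,
                        elite_frac=0.1, noise=0.1, seed=0):
    """Maximise `score` over real vectors with a Gaussian cross-entropy search.

    All parameter names and defaults are illustrative assumptions.
    `noise` is a constant added to every variance after each update, so the
    sampling distribution cannot collapse to a point prematurely.
    """
    rng = random.Random(seed)
    mean = [0.0] * dim
    var = [100.0] * dim  # broad initial distribution
    n_elite = max(1, int(population * elite_frac))

    for _ in range(iterations):
        # Draw a population of candidate parameter vectors.
        samples = [[rng.gauss(m, v ** 0.5) for m, v in zip(mean, var)]
                   for _ in range(population)]
        # Keep the best-scoring fraction (the "elite" samples).
        samples.sort(key=score, reverse=True)
        elite = samples[:n_elite]
        # Refit the Gaussian to the elite, then inflate the variance.
        mean = [sum(x[i] for x in elite) / n_elite for i in range(dim)]
        var = [sum((x[i] - mean[i]) ** 2 for x in elite) / n_elite + noise
               for i in range(dim)]
    return mean


def toy_score(w):
    # Hypothetical objective with its maximum at (3, -2).
    return -((w[0] - 3.0) ** 2 + (w[1] + 2.0) ** 2)


best = noisy_cross_entropy(toy_score, dim=2)
print(best)
```

With `noise=0.0` the same loop reduces to the plain cross-entropy method, whose variance can shrink to nearly zero before a good region has been found; in a Tetris setting, `score` would be replaced by the average result of games played with the sampled controller weights.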

Similar resources

Cross-Entropy Method for Reinforcement Learning

Reinforcement Learning methods have been successfully applied to various optimization problems. Scaling this up to real world sized problems has however been more of a problem. In this research we apply Reinforcement Learning to the game of Tetris which has a very large state space. We not only try to learn policies for Standard Tetris but try to learn parameterized policies for Generalized Te...

Improvements on Learning Tetris with Cross Entropy

For playing the game of Tetris well, training a controller by the cross-entropy method seems to be a viable way (Szita and Lőrincz, 2006; Thiery and Scherrer, 2009). We consider this method to tune an evaluation-based one-piece controller as suggested by Szita and Lőrincz and we introduce some improvements. In this context, we discuss the influence of the noise, and we perform experiments with ...

Tetris™: Exploring Human Performance via Cross Entropy Reinforcement Learning Models

What can a machine learning simulation tell us about human performance in a complex, real-time task such as Tetris™? Although Tetris is often used as a research tool (Mayer, 2014), the strategies and methods used by Tetris players have seldom been the explicit focus of study. In Study 1, we use cross-entropy reinforcement learning (CERL) (Szita & Lorincz, 2006; Thiery & Scherrer, 2009) to expl...

Approximate Dynamic Programming Finally Performs Well in the Game of Tetris

Tetris is a video game that has been widely used as a benchmark for various optimization techniques including approximate dynamic programming (ADP) algorithms. A look at the literature of this game shows that while ADP algorithms that have been (almost) entirely based on approximating the value function (value function based) have performed poorly in Tetris, the methods that search directly in ...

Journal:
  • Neural Computation

Volume 18, Issue 12

Publication date: 2006