Slides from the workshop presentations are now available.
Results are now available. Congratulations to our winners.
Testing round closed. Thank you to all of our competitors!
Updated Testing application (R15) is now available HERE.
Proving application is now available HERE.
The rules, schedule, and prizes have been announced.
GAME ON! The software is now available.
Sign up for our mailing list to receive important announcements about the competition.
Tetris is a falling-blocks puzzle video game originally designed and programmed by Alexey Pajitnov in 1985. In Tetris, a pseudorandom sequence of tetrominoes (sometimes called "tetrads" in older versions) - shapes composed of four square blocks each - fall down the playing field. The object of the game is to manipulate these tetrominoes by moving each one sideways and rotating it by 90 degree units, with the aim of creating a horizontal line of blocks without gaps. When such a line is created, it disappears, and the blocks above (if any) fall. The game ends when the player "tops out", that is, when the stack of tetrominoes reaches the top of the playing field and no new tetrominoes are able to enter. Despite its simple rules, playing tetris well requires a complex strategy and lots of experience. Furthermore, Demaine et al. have shown that Tetris is hard in a mathematical sense as well [Demaine et al., 2003]: finding the optimal strategy is NP-hard even if the sequence of tetrominoes is known in advance. These properties make Tetris an appealing benchmark problem for testing reinforcement learning (and other machine learning) algorithms.
Tetris was first formalized as a Markov decision process by Tsitsiklis and Van Roy (1996). They achieved surprising perfomance using a simple linear architecture and a small set of expert game features. There have been several studies on applying reinforcement learning methods to Tetris, but none have significantly outperformed Tsitsiklis and Van Roy's results (see [Ramon and Driessens, 2004], [Bertsekas and Tsitsiklis, 1996], [Lagoudakis et al., 2002], [Farias and van Roy, 2006] and [Kakade, 2001]). Recent work by Szita & Lorincz (2006) presented result from a cross entropy method that are several orders of magnitude better than Tsitsiklis and Van Roy. The winners of the 2008 RL Competition, Bruno Scherrer, Christophe Thiery, and Amine Boumaza topped that [Thiéry, 2007]. Is it possible to create a reinforcement learning agent for the 2009 Competition that is even better?
The competition domain is based on the Van Roy (1995) specification of the Tetris problem. There are some differences, however:
It is easy! in the /public/agents/tetrisAgentJava directory, you will find a simple Java agent that plays Tetris. It does no reinforcement learning (or any kind of learning), and it does not play well at all. However, it does contain a few useful routines for processing the board, testing tetromino placements, etc., along with some comments. All you have to do is to plug your favorite RL algorithm in. The readme.txt file in the agent's directory contains more detailed info about the usage of te sample agent.
Observation Space: high dimensional, discrete valued containing:
Action Space: 1 dimensional, discrete valued
Rewards: positive function of rows eliminated on each step. Scoring possibilites for different Tetris parametrizations may be wildly different. To give an equal weight to each MDP in tcalculating the final score, rewards are normalized: a benchmark agent reaches about 100,000 points on each MDP. You do not have to care about normalization: the real-valued reward you get from the environment is exactly what is added up to get the final score of the agent.
Note: the competition software will provide your agent with a task specification string that describes the basic inputs and outputs of the particular problem instance your agent is facing. For the competition, the ranges provided in task specification may not be tight; they provide a rough approximation of the actual observation and action ranges. More documentation of the the task specification string can be found here .