2048 What is a game?

2048 The game is shown below, and it consists of a 4*4 total of 16 blocks. The player can slide the blocks in four directions, “up, down, left and right”. When sliding, two adjacent blocks with the same value will merge, and the new block will be the sum of the two. When the value of any square in the game reaches 2048, it is victory.

We will use the “Monte Carlo method” to build 2048 AI.

What is the Monte Carlo method?

There are many problems, the mathematical formula is very complicated, even in a short time can not find the mathematical formula. Like the area of the irregular shape down here.

We can obtain practical approximations of the area of the irregular shape by means of a “statistical simulation”. To do this, 1) generate a number of randomly positioned points in the square; 2) Count the number of points in the irregular graph; 3) Calculate the ratio of the quantity obtained in Step 2 to the total number; 4) The ratio obtained by multiplying the area of the square by step 3 is an approximation of the area of the irregular shape.

The above approach is a typical Monte Carlo method. When the number of random points we generate is large enough, the approximate value we get is closer to the theoretical value, and the error is smaller. As shown in the figure below, the simulation process of monte Carlo method for finding the sector area in a square:

The two pictures above are just two applications of monte Carlo’s method. In fact, the Monte Carlo method has a very wide range of applicability, and can be used to simulate any proportion distribution that can be simulated and counted. Like checking whether the coin is balanced enough.

In theory, the probability of flipping a coin is equal, 50 percent each. However, the actual process, can not be absolutely uniform, there is always deviation. And to figure out whether this bias is in favor of heads or tails, you can use the Monte Carlo method. Toss a coin over and over again, and then count the ratio of heads to tails. When the number of flips is infinite, this ratio reflects the fairness of the coin. In reality, we can’t flip the coin an infinite number of times, so we can only estimate the fairness of the coin within some margin of error.

In a word, monte Carlo method, in practice, gives us this convenience: we can use simulation and statistics to replace the operation process of mathematical formula, and get a solution close to the theoretical value.

Monte Carlo method and 2048 games

We can apply the Monte Carlo method to a 2048 game.

For 2048 game of any state, there are “up and down” four directions can choose; Although sometimes going in a certain direction does not change the state of the disk, it is also an optional move because it is supported by the game and does not result in a loss.

This “up, down, left, right”, which direction is good, which direction is bad, what are their respective winning rates? We don’t know, but we do know that objectively there is a distribution. If you add up all four of them, they must equal 100 percent.

You can think of this “up, down, left and right” as a four-sided die, and uneven four-sided die; Or think of them as a positive direction divided into four pieces, and they’re not equal pieces. Do we have the “2048 formula” to apply? Can we just figure out the winning percentage in each direction? I’m not a mathematician, so I haven’t found it, but I know that the Monte Carlo method can estimate approximate solutions. So come and try it.

The extreme case of the Monte Carlo method is equivalent to brute force, where you go through each permutation and combination of the four directions, the four directions after the four directions, and the four directions after the four directions, knowing whether you win or lose; Then count the number of wins each time you go “up, down, left and right” and divide it by the total number of wins.

Is violence too harsh? it doesn’t matter If you simulate 400 times, you’ll probably get 90% accuracy, and if you do an infinite number of times, you’ll probably only get 100% accuracy from 90% accuracy.

Monte Carlo method code

As described by the Monte Carlo method.

The run method accepts a parameter named Iterations. The run method represents how many times to simulate. The run method is called Simulate.

After the simulation is complete, getBestAction retrieves the action with the highest score.

How to write simulate? Just keep picking a direction at random until you die. The board.getActions method returns an empty array in the event of a victory or defeat, indicating that there are no valid actions left for the player to take in the game. The while loop can then be freed.

Board.doaction is supposed to take the game to the next state. If the game steps are infinite, then we need to control how long the simulation takes, or the number of doactions, which can be omitted for non-infinite step games like 2048.

Clone a copy of the board to avoid affecting the current state of the game. If we don’t have a game simulator, the Monte Carlo method doesn’t come in handy.

The path array variable records the action sequence we simulated this time.

When a simulation is dead, the first action and the simulation result (win/loss 01 or score) are stored in the statistics table to accumulate. Why the first action? Since our goal is to find the next action in the current game, the simulated first action corresponds to the next action we’re actually going to do.

And then the last method updateStatistic, we just update the statistics. Its implementation is also very simple, is to determine whether the action already exists, exists to accumulate, does not exist to create.

Have you noticed that there is no 2048 limit in our code, but rather a board with clone,getActions, doAction, getResult and other highly abstract methods?

Yes, the Monte Carlo method we just implemented is not customized for 2048, but can be used in different board games, video games, or games related to sequence of steps. Just write an adapter that exports game state and actions to clone,getActions, doAction, getResult, etc.

2048 games fit monte Carlo method

With just a few lines of code, we can provide methods to fit 2048 board instances to our implemented “Monte Carlo method class.”

In getActions, determine whether 2048 board is currently hasWon or hasLost, if so, return an empty array, if not, return an array [0, 1, 2, 3] meaning “up, down, left, right”.

The result returned by getResult is that the score before simulation board.score is recorded as startScore. After simulation, when getResult is performed, the current board. score-startscore is combined to obtain the actual score earned by this simulation.

The doAction method simply calls the board.move direction. Why abstract as doAction instead of doMove? Because in some games, the action is not limited to movement, so move is too specific, action is more abstract, can represent more possible actions.

Once the adapter is written, you can output a method called getBestAction that simply inputs the current 2048 boards, simulates the Monte Carlo method 400 times, and returns the action with the highest statistical score as the next action.

Each step to run a monte carlo method, although repeat many times, but it doesn’t matter, as long as you keep up with the performance, repeat repeat, repeat bring more simulation times, also means that more accurate fitting the theoretical area distribution.

If 400 times simulation, accuracy is not enough, can be increased to 800 times, 2000 times, there is always an order of magnitude, can achieve satisfactory results.

The result of victory

The following is a screenshot of 2048 successfully reached after simulation on my machine. You can also watch the process on your own machine. Of course, it is best that you implement the Monte Carlo algorithm to reinforce the impression.

We have the opportunity to introduce the Monte Carlo tree search (MCTS) based on the Monte Carlo method, which is actually the structural optimization of the Monte Carlo method in programming, and the essence of the Monte Carlo method.

Click to view the original article, visit Github repository, you can find DEMO, welcome star.