PokerChip Race | Top CodinGamers Code reviews

STAY CONNECTED, FOLLOW CODINGAME NOW


The winners of the PokerChip Race contest, Xylo and Gamoul, share with us their insights about their code and strategies.

NB: If you took part in PokerChip Race and wish to share your own strategy with the CodinGame community, it would be a pleasure! Just use the forum :)


Xylo (1st place, C, France):

 

For this challenge, I forced myself to keep the code as simple as possible. I didn’t have a lot of time to spend, I didn’t want to make the same mistakes that I had made in the previous challenge, and I didn’t want to overwhelm myself with my own complex code (all relative). Fortunately, PokerChip Race was, in my opinion, easier than Game of Drones in that accelerating a chip came with a price. Most of the time, the chips didn’t accelerate. Because of this observation, I came up with three simple hypotheses:

1. In each current round, I could suppose that the chips would accelerate in whichever way was best
2. I could suppose that the chips wouldn’t accelerate at all
3. I could suppose that the opposing chips would never accelerate, even in the current round.

The 2nd and 3rd hypotheses can seem aggressive, but in practice they worked pretty well. I had tried at one point to mitigate the 3rd by taking into account the (either current or future) proximity of my chips to the larger opposing chips, but the result was worse. My AI was scared of the proximity of the larger chips which often made it reluctant to capture small chips or droplets even though in most cases it could have gotten away with it. In the end, I kept my 2nd and 3rd hypothesis as they were, even if my AI occasionally acted somewhat risky.
Once there, the rest came naturally: it sufficed to just take a set of strategies dictating my chips’ actions at each current turn and to evaluate them one after the other while simulating what would happen if my hypotheses were still valid. In each of those simulations, I assigned a scored based on the mass gained or lost by my chips. Then, I took the strategy that had the highest score. Easy, right? Anyway, implementing this method required some discipline to avoid certain pitfalls.



Simulation


One of the first steps of the implementation was to simulate the sequence of many turns. It was a little more difficult than it first seemed because all of the rules of implementation were somewhat more complicated than in Game of Drones and in Tron Battle. Fortunately, the thread in the forum called “PokerChip Race Physics Explanations” contained nearly all of the necessary explanations, and it worked well to post a question in order to get a quick and precise response from Loïc. I don’t think that I would have been able to put my strategy into action without his information. So thank you very much, Loïc!

Concerning the architecture, I defined a structure called state_t that contained the state of the game at any given moment as well as a function state_update that updated this state by simulating the flow of the game on a turn. Following a message from Manwe, Loïc had clarified that on each turn the server side contained a loop simulation of 500 steps with crash tests at each step. I carefully re-implemented the same mechanism on my side. In order to validate this part, I created two instances of state_t: one corresponding to the state I was reading at each turn from stdin (via a function called state_rate) and the other one was a copy of the initial state but it was updated at every turn with state_update instead of state_read. Then I temporarily programmed my AI to perform very simple and deterministic actions while ensuring that my prediction that the opponents would follow its strategy. On each turn, I compared my two states concerning the number, the position, the speed, and the size of the entities, and I sent all the differences that I found to stderr. After all the problems were fixed, I could simulate hundreds of rounds, thus tens of thousands of iterations, without the two states differing considerably.


Score Calculation


The assigned score from a strategy came from an assessment of 7 turns. One of the current limitations of my AI was that if one of my chips was stopped, it was not going to accelerate to take a smaller entity that was more than 7×200/14 = 100 units away. This was sometimes problematic.
For each of the seven turns of assessment, I assigned a sub-score corresponding to the sum of the total mass of the chips and two times the mass of my biggest chip. This heuristic allowed it to concentrate on scenarios where my chips were gathered together. The final score was the sum of these sub-scores, weighted by 0.7^n – where n was the number of the turn in question from 0 to 6 – in order to favor scenarios where I gain mass quickly. My 2nd and 3rd hypotheses had their limitations and the more time that passed the more chances there were that the opponent would move or that I myself would be interested in moving. Via this inclination, the sub-scores that corresponded to the most probable developments had the most impact on the final score.


Strategies


To determine the actual acceleration of my chips, I used a probability method. At each turn, I tried the following strategies:
+ Do nothing
+ Converge all my chips to their equally weighted barycenter (rarely worked)
+ Alter the best strategy found so far while accelerating a chip randomly (repeated 40 times and worked well)
+ Alter the best strategy found so far by simulated annealing (5 iterations)

In practice, we can consider this process as one large simulated annealing in which we stay a long time at a maximum temperature (c. – a – d during 40 iterations where the chips accelerated in a totally random way). The most effective actions used by my AI (“double propulsion” while using a wall, absorbing an opponent via a rebounded droplet) were simply those kept in the end of the 50 strategies randomly determined and tested at each turn.

Voila, it’s done. Now, to move the bot into the arena.


It’s too slow


…and it timed out. The following was not surprising: 7×500=3500 simulation steps for each of the strategies, while verifying at each step if there would be a collision between all the possible pairs of entities. Let’s assume that there are 20 entities and for fifty strategies, that’s 50×3500×(20x19)/2 = too many crash tests. Just to let you know, it was possible to evaluate 3 strategies per turn with the AI such that I have described up until now. When I had tried to use it in the arena, I think it that it could have only gotten around 20-something place.

From what I could read on the forum, a lot of participants used analytical methods to calculate the date of the closest collision, and this is exactly what needed to be done. With that being said, it was necessary to round up the date to the simulation step up to the next one for maximum precision and redo the tests and calculations at this rounded-up date.

In this way, there was no loss concerning precision of the simulation, and it was possible to see a gain in performance in the order of ×20 to reach the 50 tested strategies on each turn that I was aiming for. I think that this was really the true key to this challenge. It was important not to underestimate the importance of having quick and reliable predictions during the game.


Winamax Open Day


In some ways, my strategy had limitations during the Winimax Open Day. During this event, the rules were the same as during the two week challenge except that the when the chips touched the edge of the table they were lost instead of bouncing back. Therefore, I was presented with three issues:

1) I had to change the function of calculating the score to ensure the situations were favored when my chips moved slowly (therefore taking a long time to reach the edge of the table). This was not trivial: during my first attempts to decrease the average speed of my chips, my AI often chose to voluntarily expel the fastest ones on the table, which I really wish it would have refrained from doing. Nevertheless, it was necessary to try to quickly absorb the neutral droplets early on in the game, therefor not favoring the slower chips at this time. In addition to these two problems, the fact that slowing down my chips past a certain point had decreased the overall effectiveness of my AI because of the shortsightedness facing those that were almost at a standstill, as aforementioned. To somehow overcome this limitation, I spent the number of simulation turns to calculate the score from 7 to 15 which reduced the number of evaluated strategies.

2) The strategy of taking a certain number of my opponents whose validity was limited due of the presence of the edges was suddenly becoming more valuable. This was especially the case for Manwe who admitted that double acceleration from rebounding against an edge, which my AI often did, posed some problems for him. But it was impossible to accelerate like this when there were more edges!

3) The advanced AIs that were anticipating adverse actions were becoming so strong that their predictions were becoming correct more and more often. Therefore, the adverse decisions that were difficult to predict, especially those involving complex rebounds against the edges, were becoming much rarer.

------------------------------




Gamoul (3rd place, PHP, France)


 My first reaction upon seeing the description of the challenge was that it was going to be really complicated. Trajectory calculations, collision and consequence management, rebounds on edges…designing the game system alone would be a real headache.

Ultimately, I had not even considered all of the parameters; for example, I didn’t take into account that there would be a change of direction during the collisions. I couldn’t manage to control the rebounds against the edges until just a few hours before the end, and even then, I had only winged it. And I also couldn’t manage the trajectory of the ejected droplets during the accelerations.


Collisions


At the beginning, I had thought that the main point was to be able to anticipate the collisions. For example, knowing at what time my chips would collide with each other. I had first thought to simulate the environment while moving the chips step by step, but the problem was that having too large of a step would make me miss a collision, and that having a too small of a step would either take too much time or not allow me to see farther.

Then I told myself that by tracing the 2 lines of trajectory of the 2 chips, and by looking at the intersection between the lines and the point on each line where they crossed proportionally to their size, I could probably get somewhere. But that seemed a too complicated to me....

Then I decided that I had to do a little math.

After writing time equations in x and y for each chip, I got the functions of X (t ) = t * x + vx and Y (t ) = t * y + vy.

A collision between 2 chips occurs when the distance between the two centers are either less than or equal to the sum of the radii. The distance is easily calculated from the X and the Y coordinates of the 2 chips, and ultimately, we are left with a quadratic equation according to t. Either it has no solution (no collision), 1 solution (the chips brush past each other), or 2 solutions (a beginning and end collision). Negative solutions indicate divergent trajectories.

Computer algebra software allowed me to get the expressions of solutions (t) based on the positions, speeds, and radii of the two chips.


How to Play?


Now that I could kind of simulate the environment, there was the issue of knowing what to do with it. Did My chips have to set a target, then see if it was reachable? Look for the biggest one that it could absorb the fastest?

Finally, I did what was easiest: I tested.

I watched to see what would happen to each chip whether it did nothing or if it accelerated in all directions.

And herein lies another performance issue…I had to set a shoot angle step in order to stay accurate as well as avoid timeouts. I made a variable step according to the number of my chips and the number of total chips on the table. At first, the step varied between 0.05 and 0.2 radians, but over the course of my improvements, it got worse. I had a step of 0.5 rad at worst and 0.1 at best.

So in the end, on each turn, all of my chips simulated all around them a shot and a wait, and in each of the cases I checked if there was a collision and after how long.

I needed an evaluation method to make my choice. The bigger the target the better the choice (without it being bigger than me), but the time to reach the target had to be the shortest.

I therefore chose to give a score to each tested angle using the following totally empirical method:
-    If the next collision is with a bigger chip (except for mine), the score = -1 / (1+t)
-    If it’s one of my chips, the score is 0.5 * r / (1+t)
-    If it’s a smaller neutral chip, the score is r / (1+t)
-    If it’s a smaller enemy chip, the score is 2 * r / (1+t)

Therefore, a negative score meant that I would eat, and so the longest amount of time the better. But if the target was smaller, I would favor the larger radius, the shortest time, and also the absorption of the enemies, then the neutrals, and then mine.

There were some necessary adjustments to make. For instance, if my chip had already flown by a target, it was not particularly wise to speed up to reach it (unless it was an enemy). It was also necessary to accept the loss of radii at each acceleration and not spend more on acceleration than earned on absorbing the target.

I also adjusted the predictions when I assumed that the bigger enemies would rush towards me, so I increased the speed as if he was just in front me.

Up until now, I had only simulated a single acceleration. I did it as if one could accelerate 2 or 3 times faster (a little less in fact to compensate for an inaccurate simulated trajectory) to try to foresee 3 attack shots in advance. If no collision was found on one shot, I’d look at the next 2 and 3 shots.


Anticipating Collisions With My Targets


Although it took me to around 20th place after a week of playing, the big problem with my strategy was that I was calculating the collision time towards my targets while disregarding other intermediate collisions. So I implemented the following solution:

For each of my targets, I tested the collisions with all the other chips on the table in order to see if my target would have a collision before my chip did. I calculated the consequences of this collision, and that became my new target. But I didn’t update the trajectories.

This was a very approximate method (I was only testing a collision) that was effective in and of itself. I came to realize in the end that the game goes so fast that the changes in target trajectory were not so important.

On the contrary, by improving the level of execution time forced me to rethink my code in depth in order to optimize it. Factoring in the calculations, reducing the calls to the table, removing error outputs…and no more little messages from my chips (although some of the opponents had made me laugh).

After putting all of that into place, I was surprised to find myself in the top 10, although I wasn’t yet preoccupied by the rebounds on the edges.


Rebounds


Tuesday night was the last night of the challenge and of implementing those stupid rebounds that had been causing me problems since the beginning of the game because of the fact that the varying radii for each chip complicated the task.

In fact, if we only had to deal with points, it would have been easier to apply symmetry to the board’s edges and add virtual chips from the outside. We had 9 virtual boards around the real one, but it would only process one vertical and one horizontal rebound at a time.

Speaking of the radii, the most elegant solution would have been to calculate the time equations of each chip’s coordinates while considering the inversions of vx or vy during the rebounds, then limit each equation’s range of values in order to have a set of segments instead of a line…which would have catastrophically increased the amount of collision calculations.

Finally, I handled the smallest problem:
1.    I managed the rebounds only on 1 axis by applying symmetry
2.    I only considered my chip’s rebound (thus taking into account my radius)
3.    I calculated a rebound only if my chip went towards the edge and was less than a certain distance from the edge
So in my collision function, I tested 3 cases: a normal case (no rebound), a case where my chip rebounded on the vertical axis towards what it was headed for, and the same case for the horizontal axis. This multiplied the time of collision calculations by 3 in the case where all the chips would have been close to the edge and headed towards this one.


Final Sprint


I was not happy with my ranking on Wednesday. Despite implementing the rebounds, I could not hold my spot in the top 10. I was a little panicked on Wednesday night as I tried to make things a little better by anticipating the absorptions of my targets by the intermediary chips.

As previously explained, for each reachable target in time t, I would watch to see if it would not be eaten by another chip in time < t. Therefore I had a prediction level, and I had tried to pass at 2. This was catastrophic. With less than 2 hours left for the challenge, I found myself in 50th place full of timeouts and no idea which backup I should use. Should I reuse the one without the rebound management?

Finally, after massively stressing out, I found 2 huge bugs in my version from Tuesday night. I corrected it, I made some adjustments to the constants, and I sent in my last AI with less than an hour before the end. I didn’t even have time to save it.

A few minutes before the end, I had climbed back up to 6th place and was neck in neck with my opponents. Then there was THE bug! I don’t have much to complain about. After the tournament, returning my AI into the arena in a calmer environment probably gave me less timeouts and the ability to finish on the podium.

In conclusion, congratulations to all of the participants and a huge thank you to CodinGame for organizing this type of event. See you soon in the Poker Chip Race Training.

No comments

Post a Comment