Curiosity did not kill the cat. It was framed.

Just curious (again).

Did anybody do anything “interesting” this year?
For example, any attempts at genetic algorithms, genetic programming, reinforcement learning, simulated annealing, or anything else along those lines?

Or was good ol’ graph searching king again (I have a strong suspicion this is the case)?

Personally, I took a stab at using a GA to optimise eval function weightings. It clearly did not pan out super well for me, but considering that (in retrospect) it shouldn’t really have worked at all, I’m pleasantly surprised. Plus the failure to shine is probably due to some other stupidity on my part.
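For anyone curious what that kind of experiment looks like in code, here is a minimal GA sketch in Python. The fitness function is a stand-in for "play a bunch of games with these eval weights and return a score"; all names and parameters here are made up for illustration:

```python
import random

def mutate(weights, sigma=0.1):
    # Gaussian perturbation of each weight
    return [w + random.gauss(0, sigma) for w in weights]

def crossover(a, b):
    # Uniform crossover: take each gene from either parent
    return [random.choice(pair) for pair in zip(a, b)]

def evolve(fitness, n_weights=4, pop_size=20, generations=30):
    # fitness(weights) -> higher is better (e.g. win rate over N games)
    pop = [[random.uniform(-1, 1) for _ in range(n_weights)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        elite = pop[:pop_size // 4]            # keep the best quarter
        children = [mutate(crossover(random.choice(elite),
                                     random.choice(elite)))
                    for _ in range(pop_size - len(elite))]
        pop = elite + children
    return max(pop, key=fitness)
```

In practice the expensive part is the fitness call: each evaluation means playing many full games, which is why these runs take so long.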

Anyway, not looking for in-depth details, just wondering if anybody ventured “off the beaten path”, so to speak.

I generated a list of possible combinations of the following values (speed, damage, block, lane (side or mid), boosts, boost counters and lizards). I capped boosts and lizards at 15.
I then ran hundreds of races from block 1499 and stored the average number of turns it takes a solo car to complete the race, for all the combinations with block set to 1499. You can guess that those values were mostly 1.
I then did the same for 1498 down to 1480.
For all the blocks below 1480 I would use tree search to the point where I could not see the terrain anymore, using the averages already calculated as fitness values.
I did this all the way down to 0 and stored all these values in files.
It took 2 weeks of my laptop running 24/7 lol.

This meant that my base fitness function was a giant hash table (about 40 MB in size).

I added bonuses for the PvP-oriented powerups like EMP afterwards, using something like A * log(powerupCount).
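A rough sketch of how such a precomputed table could serve as a fitness function. The states, values, and the log(1 + n) form below are illustrative stand-ins, not the real data or formula:

```python
import math

# Precomputed offline: (speed, damage, lane, boosts, lizards, block)
#  -> average turns to finish from that state (values invented here)
avg_turns = {
    (9, 0, "mid", 2, 1, 1480): 34.5,
    (6, 2, "side", 0, 0, 1480): 41.2,
}

def fitness(state, emp_count, a=2.0):
    # Fewer expected turns-to-finish is better, so negate it;
    # add a diminishing bonus for PvP powerups like EMP.
    base = -avg_turns.get(state, 100.0)   # pessimistic default for unseen states
    bonus = a * math.log(1 + emp_count)   # log(1 + n) avoids log(0)
    return base + bonus
```

The appeal of the table is that a lookup is O(1) at runtime, so all the expensive simulation cost is paid offline.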


@LouisLotter that’s a very interesting method - did you invent it yourself, or is it based on an existing technique?

I plan to write a blog post about mine with all the details, but in a nutshell:

I look at all possible game states 3 turns ahead, evaluate them, and then play the Minimax move: in other words, the move for which the opponent’s best counter has the worst evaluation (for him).
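A generic sketch of that search in Python, assuming helper functions for move generation, state transition, and evaluation (none of these names are from the actual bot):

```python
def minimax(state, depth, maximizing, moves, apply_move, evaluate):
    # Returns the best achievable evaluation from `state`, assuming
    # both players play optimally against the same evaluation.
    if depth == 0:
        return evaluate(state)
    results = [minimax(apply_move(state, m), depth - 1, not maximizing,
                       moves, apply_move, evaluate)
               for m in moves(state)]
    return max(results) if maximizing else min(results)

def best_move(state, depth, moves, apply_move, evaluate):
    # Pick the move whose worst-case reply (opponent's best counter)
    # still has the highest evaluation for us.
    return max(moves(state),
               key=lambda m: minimax(apply_move(state, m), depth - 1,
                                     False, moves, apply_move, evaluate))
```

With 3 turns of lookahead the branching factor stays small enough that plain minimax, without pruning, can be fast enough.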

My evaluation function is pretty simple. I thumb-sucked some terms that I thought would be important (speed, lead ahead of opponent, number of powerups in hand) and just take a weighted sum. I thumb-sucked the weights to start, and then I tuned them in probably the simplest way possible: one by one, nudge a weight a little bit, compare with the current champion (by playing 100 games) and if it’s better, it becomes the new champion.
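That tuning loop fits in a few lines. Here `win_rate` is a stand-in for "play 100 games between challenger and champion and return the challenger's win fraction"; all names and parameters are illustrative:

```python
import random

def tune(weights, win_rate, rounds=50, step=0.05):
    # win_rate(challenger, champion) -> fraction of games the challenger
    # wins; a stand-in for "play 100 games" from the post above.
    champion = list(weights)
    for _ in range(rounds):
        i = random.randrange(len(champion))           # pick one weight
        challenger = list(champion)
        challenger[i] += random.choice([-step, step])  # nudge it a little
        if win_rate(challenger, champion) > 0.5:       # better? promote it
            champion = challenger
    return champion
```

This is essentially stochastic coordinate-wise hill climbing: clunky, as noted, but it only ever promotes a challenger that demonstrably beats the champion.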

The tuning was super clunky and inefficient, but made a huge difference!

I also put a lot of effort into making sure that my version of the game engine was “bug-for-bug” compatible with the official one. My bot discovered the “truck-then-emp” exploit all on its own, which made my day when I saw it!


I think I invented the way it applies to this game, but it’s a general idea in the research papers on these kinds of things to try and estimate something like rounds to finish the race.
I was trying to come up with a Monte Carlo tree search approach to this game, but then I realised that, for the part of the track that you have not seen, you know as much before the race as during it, so you can just as well do the Monte Carlo simulations offline. The only problem is that you have to do it for a lot of possible states.

In terms of the tree search itself I did something similar to you, so it’s mainly our evaluation functions that are very different.


I would say my golden feature was a complete store of the whole map I could see.

So what I managed to implement (but not really use to its full potential) was the following:

I’d store the whole map,
Then I would store the enemy X, Y and speed,
I would use those variables to determine what moves my enemy made,
Which in turn helped me track powerups: knowing when they could boost, knowing when they could EMP, knowing when they had lizards. But… I probably needed 2 more weeks to really hone this.
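Inferring the opponent's command from two consecutive observations might look roughly like this. The thresholds and command names below are illustrative, not the real game rules:

```python
def infer_enemy_command(prev, curr, base_accel=1):
    # prev and curr are (block, lane, speed) observations one turn apart.
    # The decision thresholds here are invented for illustration.
    d_lane = curr[1] - prev[1]
    d_speed = curr[2] - prev[2]
    if d_lane != 0:
        return "TURN_LEFT" if d_lane < 0 else "TURN_RIGHT"
    if d_speed > base_accel:
        return "USE_BOOST"        # speed jumped by more than a normal accelerate
    if d_speed == base_accel:
        return "ACCELERATE"
    if d_speed < 0:
        return "HIT_OBSTACLE_OR_DECELERATED"  # mud, wall, or an EMP landed
    return "NOTHING"
```

Once you can label each enemy turn like this, counting inferred boost/EMP/lizard uses against pickups they drove over gives a running estimate of their hand.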

Then I would simulate my own best moves,
And simulate my best move if I do “Nothing”, which gave me a score for my Nothing moves.

Then I would simulate what a cybertruck placement would do to my enemy (potentially forcing a boost break),
Or just drop it on the closest enemy powerup.

This backfired a lot, because I often got EMP’d when I was going to pass over boosts that I had covered with a cybertruck.

In terms of what data I had: I would run the following five sims:

My Best
My Best Nothing
Enemy Best
Enemy Best Cybertruck Interference
Enemy Best Powerup interference

Then I would score them all and choose something from there.

But my actual scoring algorithm seems to be slightly off still.

I had all the data, but I never really used it.
And I had some last-minute bugs that were also a bit troublesome.

But yeah, I would say that’s my golden feature.
Just a pity I did not manage to actually use it better.
I did pull off some moves where I would counter my enemy.

I would have loved to see what this implementation could have done with a training AI (for example, fine-tuning my powerup values and running hundreds of matches to find that “near perfect” balance).
But as usual I generally brute-force my entries, and this year I lost control of things a bit.

I’d say that’s the most impressive aspect of my bot, and the reason I believe that on the right seed my bot would be unstoppable.
But I did not quite get my “when to go for a powerup and when to leave it” decision right.

And sometimes simpler is better.


I experimented with quite a few techniques!

Here are two of the notable ones:

  1. Early on I tried to build statistical models (very similar ideas to what Louis reported), but I struggled to get a clear signal. Here are two examples:
  • Given the state as you exit the map (speed, position, power-ups, etc.) as features, use a random forest classifier to pick the exit which on average resulted in the shortest game. The random forest classifier was trained using thousands of games and compiled into machine code using a compiler which I wrote to go from scikit-learn Random Forests -> Lisp -> Assembly (via SBCL).
  • Given those same features at the exit from the current board, use the tens of thousands of random board states to try and predict what was likely to happen two steps into the next board, for every move combination.
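A sketch of the first idea using scikit-learn's RandomForestClassifier. The features, labels and the toy rule generating them below are invented stand-ins for the real logged game data:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for logged games: features are the state at map exit
# (speed, y-position, boost count); the label is the exit that led to the
# shortest game. Real training data would come from thousands of sim runs.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(1000, 3))
# Toy rule so the model has signal: fast cars favour exit 1, slow ones exit 0.
y = (X[:, 0] + 0.2 * X[:, 2] > 0.6).astype(int)

clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(X, y)

def pick_exit(speed, y_pos, boosts):
    # Predict which exit leads, on average, to the shortest game.
    return int(clf.predict([[speed, y_pos, boosts]])[0])
```

The compilation step to Lisp and then assembly is specific to that bot; the classifier itself is just an ordinary scikit-learn fit/predict cycle like the above.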

Neither of these statistical/ML approaches worked, and in hindsight I think it’s because I assumed uncorrelated random boards, i.e. that boards you’ve encountered have no impact on future boards (which greatly reduced the signal strength of my models). It turns out that they are correlated! I know this because I kept building the bot runner from source as I worked on my own, and it has tests which verify that the numbers of muds, walls and power-ups never exceed a specific value and have a particular mean. I came to a realization about the significance of this much too late, though.

  2. Use an evolutionary algorithm (differential evolution) to score board states. It was actually super effective for me. Running the optimisation would take upwards of a week, but it found good candidates.
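For reference, the classic DE/rand/1/bin scheme is only a few lines. This sketch minimises a generic loss function, which for board scoring might be something like average turns-to-finish over many simulated races; all parameters here are illustrative, not from the actual optimiser:

```python
import random

def differential_evolution(loss, dim, bounds=(-1.0, 1.0),
                           pop_size=20, F=0.8, CR=0.9, generations=60):
    # DE/rand/1/bin: mutate with a scaled difference of two random members,
    # apply binomial crossover, then keep the trial only if it is better.
    lo, hi = bounds
    pop = [[random.uniform(lo, hi) for _ in range(dim)]
           for _ in range(pop_size)]
    scores = [loss(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            a, b, c = random.sample([j for j in range(pop_size) if j != i], 3)
            trial = [pop[a][k] + F * (pop[b][k] - pop[c][k])
                     if random.random() < CR else pop[i][k]
                     for k in range(dim)]
            s = loss(trial)
            if s < scores[i]:                 # greedy selection
                pop[i], scores[i] = trial, s
    return pop[scores.index(min(scores))]
```

As with the GA above, the week-long runtime comes from the loss evaluations (whole games), not from the optimiser itself.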

I actually wrote about 80 commits’ worth of work before the end of the tournament and had to trash the lot, because it contained some trivial mistakes which I didn’t spot, and I wouldn’t have had enough time to run the optimiser again anyway(!)

Something else worth noting is that I (in true Common Lisp style) implemented a DSL for writing bots for the game. It has some cool features: static checking for various aspects of how it wires itself up, free-variable detection and elimination, a built-in method for iteration (which optimises tail calls!), and a very efficient way of storing state which means that (if I had gone far enough, which I didn’t have the time to do) I could have stored every piece of state unboxed on the stack and gotten C-like performance(!) I think that just implementing the DSL, and the learning I got from that experience, made this tournament hugely worthwhile for me this year!


Well… I think as far as “exotic” and “hardcore” go, you might take the crown @Quiescent.