Four of the best professional poker players in the world – Dong Kim, Jason Les, Jimmy Chou, and Daniel McAulay – were recently beaten by Libratus, a poker-playing AI developed at Carnegie Mellon University and run on a supercomputer at the Pittsburgh Supercomputing Center. Over 20 days of play (10 hours per day), the four professionals played a whopping total of 120,000 hands of heads-up No-Limit Texas Hold'em against Libratus, and each of them lost.
A player may face around 10^160 different situations in No-Limit Texas Hold'em: more than the number of atoms in the universe. It took extensive machine learning to work out the most rewarding actions in these situations and to prioritize which of them to compute first. Libratus works by running extensive simulations, taking into account the way the professionals play, and figuring out the best counter-strategy. Although it is not without flaws, any “holes” the players found in Libratus’s strategy could not be exploited for long, as the algorithm would quickly learn and adapt to prevent further exploitation. The experience was completely different from playing a human, the professionals said, as Libratus would make both tiny and huge bets and would continuously change its strategy and plays.
The video below provides more detailed information and also shows the million-dollar margin by which Libratus won at the end of the 20-day poker (training) marathon:
SethBling calls himself a video game designer, a hacker and an engineer. You might know him from MarI/O: his neural network that got extremely good at playing Super Mario Bros. The video below shows the genetic approach Seth used to train this neural network. Seth randomly generated a starting population of neural networks in which the inputs – the current frame in the Mario video game – were randomly connected to the outputs – the eight buttons to press (jump, duck, up, down, right, left, etc.). The neural nets that made it furthest into the game got a larger chance to pass on their genes (their input-output relations) to the next generation, with slight mutations. In this way, Seth automatically generated neural networks that were more and more proficient at completing the game. In short, by evolution, Seth’s neural network learned the most effective response to the changing video game environment.
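To make the evolutionary idea concrete, here is a minimal sketch in Python. It is not Seth's code: MarI/O is based on NEAT, which also evolves the structure of the network, whereas this toy only evolves the weights of fixed input-to-button connections. The task, the `SITUATIONS` table and all function names are made up for illustration, with "buttons set correctly" standing in for "distance travelled in the level".

```python
import random

# Hypothetical toy task standing in for "distance travelled in the level":
# each situation is (inputs, buttons that should be pressed).
SITUATIONS = [
    ([1, 0, 0], [True, False]),
    ([0, 1, 0], [False, True]),
    ([0, 0, 1], [True, True]),
    ([1, 1, 0], [False, False]),
]
N_INPUTS, N_BUTTONS = 3, 2

def random_genome(rng):
    """One random weight per input-to-button connection."""
    return [[rng.uniform(-1, 1) for _ in range(N_INPUTS)] for _ in range(N_BUTTONS)]

def press_buttons(genome, inputs):
    """A button is pressed when its weighted sum of inputs is positive."""
    return [sum(w * x for w, x in zip(row, inputs)) > 0 for row in genome]

def fitness(genome):
    """Count correctly set buttons across all toy situations (0 to 8)."""
    return sum(
        pressed == wanted
        for inputs, target in SITUATIONS
        for pressed, wanted in zip(press_buttons(genome, inputs), target)
    )

def mutate(genome, rng, rate=0.2, scale=0.5):
    """Copy a genome, nudging each weight with probability `rate`."""
    return [[w + rng.gauss(0, scale) if rng.random() < rate else w for w in row]
            for row in genome]

def evolve(generations, pop_size=30, seed=0):
    rng = random.Random(seed)
    population = [random_genome(rng) for _ in range(pop_size)]
    for _ in range(generations):
        scores = [fitness(g) for g in population]
        champion = population[scores.index(max(scores))]
        # Fitter genomes get a proportionally larger chance to reproduce.
        # The small constant keeps the weights valid if every score is zero.
        parents = rng.choices(population, weights=[s + 1e-6 for s in scores],
                              k=pop_size - 1)
        # Keep the champion unchanged (elitism) and mutate everyone else.
        population = [champion] + [mutate(p, rng) for p in parents]
    return max(population, key=fitness)

best = evolve(generations=40)
print("best fitness:", fitness(best), "out of", len(SITUATIONS) * N_BUTTONS)
```

Keeping the champion unchanged in every generation is a design choice of this sketch: it guarantees that the best score never drops from one generation to the next, which makes the toy easier to reason about.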
After MarI/O, Seth this week posted his newest creation: MariFlow. Here, Seth trained a neural network on 15 hours of training data, consisting of Seth himself playing Super Mario Kart. The neural network thus learned which buttons (output) Seth would most likely push when he encountered a certain section of a Mario Kart track (input). However, due to random chance, the neural net would often get itself stuck in situations that Seth had not encountered in his training sessions (e.g., facing backwards or driving against a wall). The neural net would fail miserably in such situations because it had not learned how to behave in them. Accordingly, Seth had to generate new training data for these situations, and he did so through human-computer interaction: Seth and the neural net would take turns playing for a while, thus generating training data for situations that Seth would not have encountered on his own. After the neural net was trained with these additional data, it became quite proficient at playing Mario Kart (like Seth), often even winning matches! If you want to know more, you can read the manual here or watch Seth’s video below. If you want to replicate or just play with the data, Seth made everything available here.
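The turn-taking idea can be sketched in a few lines of Python. This is only an illustration under heavy assumptions, not MariFlow itself: the "track" is a single number (the kart's offset from the centre line), the `human` function stands in for Seth, and the hypothetical `TablePolicy` replaces the neural network with a simple lookup of the most frequent action per state. The point is merely to show why a policy trained on human-only play has gaps, and how alternating control fills them.

```python
import random
from collections import Counter, defaultdict

STEER = {"left": -1, "straight": 0, "right": 1}  # hypothetical button set
WALL = 3  # offsets run from -WALL to +WALL; +/-WALL means "against a wall"

def human(offset):
    """Stand-in for the human player: always steer back to the centre line."""
    if offset > 0:
        return "left"
    if offset < 0:
        return "right"
    return "straight"

def move(offset, action, drift):
    """Apply the steering action plus an external drift, clamped to the walls."""
    return max(-WALL, min(WALL, offset + STEER[action] + drift))

class TablePolicy:
    """Tiny stand-in for the neural net: the most frequent action per seen state."""
    def __init__(self, data):
        self.votes = defaultdict(Counter)
        for offset, action in data:
            self.votes[offset][action] += 1

    def seen(self):
        return set(self.votes)

    def act(self, offset):
        if offset not in self.votes:
            return "straight"  # no training data for this state: fall back blindly
        return self.votes[offset].most_common(1)[0][0]

def human_only_session(steps):
    """Phase 1: the human drives alone under mild, regular drift."""
    data, offset = [], 0
    for t in range(steps):
        action = human(offset)
        data.append((offset, action))
        offset = move(offset, action, (1, -1, 0)[t % 3])
    return data

def interactive_session(policy, rounds, turn_length, rng, slip=0.3):
    """Phase 2: net and human alternate; only the human's turns are recorded."""
    data, offset = [], 0
    for _ in range(rounds):
        for _ in range(turn_length):  # net's turn: may wander into new states
            action = policy.act(offset)
            if rng.random() < slip:  # imperfect imitation, modelled as random slips
                action = rng.choice(list(STEER))
            offset = move(offset, action, rng.choice((-1, 0, 1)))
        for _ in range(turn_length):  # human's turn: recover and record
            action = human(offset)
            data.append((offset, action))
            offset = move(offset, action, rng.choice((-1, 0, 1)))
    return data

rng = random.Random(0)
demo = human_only_session(steps=30)
net = TablePolicy(demo)
extra = interactive_session(net, rounds=50, turn_length=4, rng=rng)
retrained = TablePolicy(demo + extra)
print("states seen by the human alone:", sorted(net.seen()))
print("states seen after taking turns:", sorted(retrained.seen()))
```

In this toy, the human alone only ever visits offsets close to the centre line, so the first policy has nothing to go on near the walls. Whatever extra states show up in the second printout are there only because the net drove the kart somewhere the human would not have gone by himself.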
Seth has active YouTube, Twitch and Twitter channels and I recommend you check them out!