AlphaZero is the New King of Computational Chess


Chess changed forever today. And maybe the rest of the world did, too.

If you didn’t see this coming, you probably should have. I think it was essentially preempted by this fantastic interview of Garry Kasparov by DeepMind CEO Demis Hassabis back in June. Listen from around 31:50 to get the full context of the segment I’m talking about, or just skip straight to 35:20 to get to the meat of it. Even better: listen to the whole thing.

The summary is quite high level, but if you want the real gory details you can find them in the paper, “Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm”. Two things from the paper are, I think, worth noting specifically [1]. Firstly, the more complicated input and output representations that chess needs compared to Go:

Chess and shogi are, arguably, less innately suited to AlphaGo’s neural network architectures. The rules are position-dependent (e.g. pawns may move two steps forward from the second rank and promote on the eighth rank) and asymmetric (e.g. pawns only move forward, and castling is different on kingside and queenside). The rules include long-range interactions (e.g. the queen may traverse the board in one move, or checkmate the king from the far side of the board). The action space for chess includes all legal destinations for all of the players’ pieces on the board;

I’ve been coding those rules up for a side project, and I can confirm that they’re a complete pain to represent cleanly.
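To give a flavour of the position-dependence and asymmetry the quote mentions, here is a minimal sketch of forward pawn pushes only. It is not taken from the paper or from any library: the function name `pawn_pushes`, the 0-indexed `(file, rank)` squares and the `occupied` set are all assumptions made for illustration, and captures, en passant and promotion are deliberately left out.

```python
def pawn_pushes(file, rank, white, occupied):
    """Return the squares a pawn can push to (hypothetical helper).

    Squares are (file, rank) tuples with both values in 0..7.
    `occupied` is a set of squares holding any piece.
    Captures, en passant and promotion are not handled.
    """
    step = 1 if white else -1          # asymmetric: pawns only move forward
    start_rank = 1 if white else 6     # position-dependent: double step only from here
    moves = []
    one = (file, rank + step)
    if 0 <= one[1] <= 7 and one not in occupied:
        moves.append(one)
        two = (file, rank + 2 * step)
        # The double step needs both squares in front of the pawn to be empty.
        if rank == start_rank and two not in occupied:
            moves.append(two)
    return moves


if __name__ == "__main__":
    print(pawn_pushes(4, 1, True, set()))      # white pawn on its starting rank
    print(pawn_pushes(4, 6, False, set()))     # black pawn on its starting rank
    print(pawn_pushes(4, 1, True, {(4, 2)}))   # blocked pawn
```

Even this one piece needs a colour check, a rank check and a blocking check, and that is before the diagonal captures and special cases that make the full rule set awkward to express.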

Secondly, the fact that (like all AlphaGo iterations before it) this isn’t a pure neural network based approach:

…AlphaZero uses a general-purpose Monte-Carlo tree search (MCTS) algorithm.

I’m inclined to suspect that a pure neural network approach, with no tree search at all, might be what comes next.
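For a feel of how the network and the tree search in that quote fit together, here is a rough sketch of just the selection step of such a search. It follows the PUCT-style rule described in the earlier AlphaGo Zero work, where each candidate move is scored by its average value plus an exploration bonus weighted by the network's prior. The dictionary layout, the field names `N`, `W` and `P`, and the constant `c_puct=1.5` are my own assumptions for illustration, and the paper's exact details may differ.

```python
import math


def select_child(children, c_puct=1.5):
    """Pick the action with the highest PUCT-style score (illustrative only).

    `children` maps an action to a dict with:
      'N' - visit count, 'W' - total value from simulations,
      'P' - prior probability (from a policy network in AlphaZero's case).
    """
    total_visits = sum(child["N"] for child in children.values())

    def score(child):
        # Average value so far; unvisited moves start at 0.
        q = child["W"] / child["N"] if child["N"] > 0 else 0.0
        # Exploration bonus: large for moves with a high prior and few visits.
        u = c_puct * child["P"] * math.sqrt(total_visits) / (1 + child["N"])
        return q + u

    return max(children, key=lambda action: score(children[action]))


if __name__ == "__main__":
    children = {
        "e4": {"N": 10, "W": 6.0, "P": 0.5},
        "d4": {"N": 0, "W": 0.0, "P": 0.3},
        "a3": {"N": 1, "W": 0.1, "P": 0.2},
    }
    print(select_child(children))
```

In this toy position the unvisited move with a decent prior wins the selection even though another move has the better average so far. The search is what turns the network's raw guesses into something stronger.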

  1. On a personal note, it’s nice to have gotten my Machine Learning knowledge to a point that most of the paper actually makes sense to me now.