This is really fun (for certain definitions of “fun”): a complete reimplementation, based on the published AlphaGo Zero paper. I’ve done this kind of engineering myself in the past when I needed to reproduce the results of someone else’s research. It’s what separates academic literature from, say, a technical white paper or blog post: there must be enough information in the paper to allow independent confirmation.
That’s not to say that DeepMind gave away their Crown Jewels in the paper. Obviously, there’s no such thing as a free lunch:
If you are wondering what the catch is: you still need the network weights. No network weights are in this repository. If you manage to obtain the AlphaGo Zero weights, this program will be about as strong, provided you also obtain a few Tensor Processing Units. Lacking those TPUs, I’d recommend a top-of-the-line GPU – it’s not exactly the same, but the result would still be an engine that is far stronger than the top humans.
And now for the really bad news:
Recomputing the AlphaGo Zero weights will take about 1700 years on commodity hardware…
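A figure like that is easy to sanity-check with a back-of-envelope calculation. The AlphaGo Zero paper reports on the order of 29 million self-play games; the time per game below is my own assumption about a single commodity GPU, not a number from the paper:

```python
# Back-of-envelope estimate of the self-play cost on commodity hardware.
# GAMES is from the AlphaGo Zero paper; HOURS_PER_GAME is an assumption.
GAMES = 29_000_000           # ~29 million self-play games reported in the paper
HOURS_PER_GAME = 0.5         # assumed: ~30 minutes per game on one commodity GPU
HOURS_PER_YEAR = 24 * 365.25

years = GAMES * HOURS_PER_GAME / HOURS_PER_YEAR
print(f"~{years:,.0f} years of self-play on a single machine")
```

With these assumed inputs the result lands in the same ballpark as the 1700-year figure; tweak `HOURS_PER_GAME` for your own hardware and the estimate scales linearly.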
It just goes to show: AlphaGo Zero (and the original AlphaGo before it) were not brute force solutions per se¹. But that’s not to say there wasn’t any brute force involved.
Some of DeepMind’s earlier work didn’t require this amount of computing power, though. It would be interesting to see if any of that has been reproduced in a similar fashion.
¹ That’s basically impossible with Go. The state space is too large.