An interesting criticism of DeepMind’s recent AlphaZero paper:
However, there are reasonable doubts about the validity of the overarching claims that arise from a careful reading of AlphaZero’s paper. Some of these concerns may not be considered as important by themselves and may be explained by the authors. Nevertheless, all the concerns added together cast reasonable doubts about the current scientific validity of the main claims.
I’m not sure I agree with everything written here, but there are some good points.
First of all: reproducibility. But that’s a problem with most published work in computer science. No one puts their source code on GitHub so others can reproduce their results. If you’ve ever seen or worked with source code produced for academic purposes, you can probably guess why that is. Someone did manage to write an open source implementation of AlphaGo Zero based on that paper, so it is possible. Unfortunately, it also requires massive amounts of computational power to train the neural network. “4 hours of training on 5000 TPUs” is well outside the reach of anyone outside of Google. That shows that the source code alone is not enough.
Secondly: the table definitely seems to have been tilted heavily towards AlphaZero for the 100 chess games presented as results. Stockfish was given some significant handicaps: a fixed one minute per move rather than normal time management, a relatively small 1 GB hash table for its 64 threads, and no opening book. Hopefully that was just a logistical mistake, and there will be a rematch. If peer review works as it should, either a disclaimer will be added to these results or they will be redone in their entirety.
Speaking of which, something I probably should have noted in my last post about AlphaZero: the paper hasn’t been peer-reviewed yet. In scientific terms, that means it isn’t a publication. It’s really more of a whitepaper. That’s not to say it won’t become a publication (it almost certainly will), but it isn’t one yet, and what I linked to might not be the same as what is eventually published.