Published on August 14, 2018 by

DeepMind’s AlphaGo Zero algorithm beat the best Go player in the world after training entirely through self-play. It played against itself repeatedly, improving over time with no human gameplay input. AlphaGo Zero was a remarkable moment in AI history. Move 37 in particular is worthy of many philosophical debates. You’ll see what I mean and get a technical overview of its neural components (code + animations) in this video. Enjoy!
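The self-play idea can be sketched in a few lines: play games against yourself, then label every position in each game with the final outcome and use those labeled positions as training data. The sketch below is purely illustrative (a stand-in game with random moves and results); the real system picks moves with MCTS guided by a neural network.

```python
import random

def self_play_examples(n_games=3, n_plies=5, n_moves=362, seed=0):
    """Toy self-play loop: play games against yourself and collect
    (state, move, outcome) training examples. Everything here is a
    stand-in -- real AlphaGo Zero uses MCTS-guided move selection."""
    random.seed(seed)
    examples = []
    for _ in range(n_games):
        winner = random.choice([+1, -1])  # stand-in for the game result
        history = [(ply, random.randrange(n_moves)) for ply in range(n_plies)]
        # label each position with the outcome from the perspective
        # of the player to move (players alternate each ply)
        for ply, move in history:
            outcome = winner if ply % 2 == 0 else -winner
            examples.append((ply, move, outcome))
    return examples

examples = self_play_examples()
```

Each self-play iteration yields a fresh batch of examples, and the network trained on them plays the next batch of games, which is what "getting better over time with no human gameplay input" means in practice.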

Code for this video:

Please Subscribe! And like. And comment. That’s what keeps me going.

Want more education? Connect with me here:

There are 2 errors in this video:
1. At the top of the residual network, it says ‘value’ layer twice. One should say ‘policy’ layer.
2. The residual network has 40 layers; I say 20.
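For context on that first erratum: AlphaGo Zero's residual network has two separate output heads, a policy head (a distribution over moves) and a value head (a scalar estimate of who wins). Here is a minimal NumPy sketch of that dual-head idea; all layer sizes and weight initializations are illustrative, not the actual architecture, and the deep residual trunk is collapsed into a single toy layer.

```python
import numpy as np

def dual_head_forward(board_features):
    """Toy forward pass: shared trunk -> separate policy and value heads.
    Sizes are illustrative; the real network uses a deep residual trunk."""
    rng = np.random.default_rng(0)  # fixed random weights for the sketch
    n_in, n_hidden, n_moves = board_features.shape[0], 32, 362  # 19*19 + pass
    W_trunk = rng.standard_normal((n_hidden, n_in)) * 0.1
    W_policy = rng.standard_normal((n_moves, n_hidden)) * 0.1
    W_value = rng.standard_normal((1, n_hidden)) * 0.1

    h = np.maximum(0.0, W_trunk @ board_features)   # ReLU trunk
    logits = W_policy @ h
    policy = np.exp(logits - logits.max())
    policy /= policy.sum()                          # softmax over moves
    value = np.tanh((W_value @ h)[0])               # scalar in [-1, 1]
    return policy, value

policy, value = dual_head_forward(np.ones(16))
```

The two heads share the trunk's features but serve different roles: the policy guides which moves the search explores, while the value lets the search evaluate positions without playing them out.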

This video is a part of my Machine Learning Journey course:

Join us in the Wizards Slack channel:

Sign up for the next course at The School of AI:

And please support me on Patreon:
