Google AlphaGo Zero prospers without human training input

DeepMind, part of Google, has been honing its championship-level game-playing AIs. You may remember that early last year its AlphaGo system surpassed all previous attempts at a Go-playing AI, playing and winning against accomplished human players for the first time. Now DeepMind has detailed an even better AI, one that was trained without human training inputs. In other words, it learned to play Go on its own, from scratch.

“Previous versions of AlphaGo initially trained on thousands of human amateur and professional games to learn how to play Go. AlphaGo Zero skips this step and learns to play simply by playing games against itself, starting from completely random play,” explains the DeepMind blog. “In doing so, it quickly surpassed human level of play and defeated the previously published champion-defeating version of AlphaGo by 100 games to 0.”

A novel form of reinforcement learning makes AlphaGo Zero its own teacher. The neural network starts out knowing nothing about Go beyond the rules, then plays game after game against itself, improving with each iteration. This learning is combined with a “powerful search algorithm” to make an even stronger competitor. Overall, this way of learning leaves the AI less constrained by the limits of human knowledge.
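To make the self-play idea concrete, here is a minimal sketch in Python. It is not DeepMind's method: AlphaGo Zero pairs a deep neural network with a search algorithm, whereas this toy swaps in a plain lookup table and a simple Q-learning-style update, and it plays a tiny stone-taking game instead of Go. The game rules, the function names and all the parameter values are assumptions made up for illustration. What it shares with the description above is the core loop: start with no knowledge, play against yourself, and learn only from who wins.

```python
import random

MAX_TAKE = 3  # assumed toy rule: a player removes 1-3 stones per turn;
              # whoever takes the last stone wins


def legal_moves(pile):
    """Moves allowed by the rules for a pile of the given size."""
    return range(1, min(MAX_TAKE, pile) + 1)


def train_by_self_play(max_pile=10, episodes=20000, alpha=0.5,
                       epsilon=0.3, seed=0):
    """Learn move values purely from games the learner plays against itself."""
    rng = random.Random(seed)
    # q[(pile, take)] estimates the result for the player about to move:
    # near +1 means winning, near -1 means losing.
    # Everything starts at 0.0, i.e. no prior knowledge of the game.
    q = {(p, m): 0.0
         for p in range(1, max_pile + 1) for m in legal_moves(p)}

    for _ in range(episodes):
        pile = rng.randint(1, max_pile)  # random starting position
        while pile > 0:
            moves = list(legal_moves(pile))
            if rng.random() < epsilon:
                move = rng.choice(moves)  # explore: try a random move
            else:
                move = max(moves, key=lambda m: q[(pile, m)])  # exploit
            rest = pile - move
            if rest == 0:
                target = 1.0  # took the last stone: the mover wins
            else:
                # The same table now plays the other side, so the
                # opponent's best outcome is the mover's worst.
                target = -max(q[(rest, m)] for m in legal_moves(rest))
            q[(pile, move)] += alpha * (target - q[(pile, move)])
            pile = rest
    return q


def best_move(q, pile):
    """Greedy move according to the learned table."""
    return max(legal_moves(pile), key=lambda m: q[(pile, m)])
```

Calling `q = train_by_self_play()` and then `best_move(q, 7)` asks the trained table for its preferred move from a pile of seven stones. In this toy game the known winning strategy is to leave the opponent a multiple of four stones, and the table should find that on its own without ever being shown a human game.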

An interesting effect of the new approach is far greater efficiency. Previously, all that ‘humanity’ in the learning process must simply have been too much information. Over the 40 days charted, AlphaGo Zero accumulated its knowledge across millions of games of self-play. Humans monitoring those games observed new knowledge, unconventional strategies, and creative new moves.

DeepMind believes that AlphaGo Zero is “a critical step towards” AI being a multiplier of human ingenuity. Furthermore, the developers are now more confident than ever that AI systems can be applied to many other situations and have the potential to solve some of the most important challenges humans now face.