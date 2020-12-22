In 2016, Alphabet's DeepMind came out with AlphaGo, an AI that consistently beat the best human Go players. One year later, the subsidiary refined its work to create AlphaGo Zero. Where its predecessor learned to play Go by observing amateur and professional matches, AlphaGo Zero mastered the ancient game simply by playing against itself. DeepMind then created AlphaZero, which could play Go, chess and shogi with a single algorithm. What tied all those AIs together is that they knew the rules of the games they had to master going into their training. DeepMind's latest AI, MuZero, didn't need to be told the rules of Go, chess, shogi and a suite of Atari games to master them. Instead, it learned them all on its own, and it is as capable at those games as any of DeepMind's previous algorithms, or better.
Creating an algorithm that can plan for success even when it doesn't know all the rules governing a simulation is a challenge AI researchers have been trying to solve for a while. DeepMind has consistently attempted to tackle the problem using an approach called lookahead search. With this method, an algorithm considers future states to plan a course of action. The best way to wrap your head around this is to think about how you would play a strategy game like chess or StarCraft II. Before making a move, you'll consider how your opponent will react and try to plan accordingly. In much the same way, an AI that uses the lookahead method will try to plan several moves in advance. Even with a game as relatively straightforward as chess, it's impossible to consider every possible future state, so instead an AI will prioritize the moves that are most likely to lead to a win.
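To make the idea of lookahead search a little more concrete, here is a minimal sketch in Python. It is not DeepMind's code and not how MuZero works internally; it is a plain depth-limited search over a toy game chosen purely for illustration (one pile of stones, each player takes one to three, and whoever takes the last stone wins). The function names, the scoring convention and the game itself are all assumptions made for this example. The depth limit stands in for the fact that a program cannot examine every future state.

```python
from typing import List, Optional, Tuple


def legal_moves(stones: int) -> List[int]:
    """Hypothetical toy rule: a player may take 1, 2 or 3 stones."""
    return [take for take in (1, 2, 3) if take <= stones]


def lookahead(stones: int, depth: int) -> Tuple[float, Optional[int]]:
    """Search `depth` moves ahead for the player about to move.

    Returns (score, best_move), where score is +1 for a forced win
    found within the horizon, -1 for a forced loss, and 0 when the
    search ran out of depth before the game was decided.
    """
    if stones == 0:
        # The opponent just took the last stone, so the mover has lost.
        return -1.0, None
    if depth == 0:
        # Horizon reached: outcome unknown, treat it as neutral.
        return 0.0, None

    best_score, best_move = -2.0, None
    for take in legal_moves(stones):
        # Consider how the opponent would respond to this move.
        opponent_score, _ = lookahead(stones - take, depth - 1)
        score = -opponent_score  # what is good for them is bad for us
        if score > best_score:
            best_score, best_move = score, take
    return best_score, best_move


if __name__ == "__main__":
    score, move = lookahead(stones=5, depth=4)
    print(f"take {move} (score {score})")
```

With five stones and a four-move horizon, the search settles on taking one stone, which leaves the opponent with four and no winning reply. Systems like the ones described above cannot afford to expand every branch the way this sketch does, which is why they concentrate on the moves that look most promising.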