In a big step toward computers capable of independent reasoning, a team of Google researchers has created a machine that can figure out how to play and win video games. The breakthrough recalls IBM's chess-winning Deep Blue computer, or Watson, a computer that IBM programmed to win at Jeopardy! In some ways the latest advance is more profound, however: Deep Blue and Watson were taught strategies that worked within the rules of their respective games. The new program learned, without being told the rules, how to succeed at a range of different games.
The rules-learning technology, described in the British journal Nature last week, could initially be important to Google for things like search, voice recognition and language translation. Further out, the researchers said, the learning system could be useful for robots and driverless cars. Demis Hassabis, who led the project, described the program as "a single general learning system," combining sensory perception and learning. That dual capability, which the authors of the paper call a "deep Q-network," or DQN, "is mastering and understanding structure," he said.
The Google program taught itself to play 49 Atari 2600 computer games from the 1980s, eventually figuring out strategies for victory. It worked out how to navigate, which actions were available and which outcomes were rewarded, then used that knowledge to improve. In 43 of the games, which included such classics as Space Invaders and Breakout, the DQN outperformed previous computational efforts to win, the paper said. In more than half the games, the new system could eventually play at least three-quarters as well as a professional human games tester.
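The trial-and-error loop described above, in which actions that lead to rewards are gradually reinforced, is the core of Q-learning. The sketch below is a deliberately simplified, tabular version on a hypothetical toy task (a five-cell corridor, not one of the Atari games); the DQN in the paper replaces the lookup table with a deep neural network that reads raw screen pixels, but the update rule is the same idea.

```python
import random

# Toy task (an illustrative assumption, not from the paper):
# a 1-D corridor of 5 cells; reaching the rightmost cell pays +1.
N_STATES = 5
ACTIONS = [-1, +1]          # move left or move right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1

# Q-table: expected future reward for each (state, action) pair.
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(state, action):
    """Apply an action; reward 1.0 on reaching the rightmost cell."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward, nxt == N_STATES - 1

random.seed(0)
for episode in range(500):
    s, done = 0, False
    while not done:
        # Epsilon-greedy: mostly exploit the best-known action,
        # occasionally explore at random.
        if random.random() < EPSILON:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s2, r, done = step(s, a)
        # Q-learning update: nudge the estimate toward the observed
        # reward plus the discounted value of the best next action.
        best_next = max(Q[(s2, act)] for act in ACTIONS)
        Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])
        s = s2

# The learned greedy policy moves right from every non-terminal state.
policy = {s: max(ACTIONS, key=lambda act: Q[(s, act)])
          for s in range(N_STATES - 1)}
print(policy)
```

Nothing here is game-specific: the agent is never told what the actions mean or where the goal is, only which outcomes score points, which is what lets the same learner be dropped into many different games.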
By figuring out for itself the rules of a system, the DQN occasionally surprised its creators with new winning strategies. Playing Seaquest, for example, the computer determined that the game's submarine could survive by staying just below the surface for the entire game. In Breakout, it hit upon a novel strategy: tunneling through one edge of the wall of bricks so the ball would bounce along the top, knocking out bricks from behind.
Hassabis is a founder of DeepMind, an artificial intelligence company that Google bought last year for a reported $400 million. He is also a champion gamer and well-regarded games designer. But the new findings underline how far artificial intelligence research is from developing human-level intelligence. DQN is not capable of forming conceptual knowledge, such as what a submarine actually is, or of transferring what it learned in one game to excel at another.
We humans, Hassabis noted, "have prior knowledge that we bring from the real world." To master abstractions or conceptual thought, he said, "we suspect it will require some new kind of algorithm." That advance is likely years away. The DQN team is now focused on bringing its machine up to a 1990s capability: learning to navigate the far more difficult three-dimensional spaces of driving games like Grand Theft Auto.