As artificial intelligence (AI) continues to develop and evolve, its capabilities grow ever more impressive: it has now learned to play a beloved video game – Minecraft – without humans showing it how to play.
Specifically, Google DeepMind’s Dreamer AI system has figured out on its own, with zero human input, how to collect diamonds in Minecraft by ‘imagining’ the future impact of possible decisions, according to a report published by Scientific American on April 4.
In the words of Dreamer’s creators, the system is a step towards machines that can generalize knowledge learned in one domain to new situations, one of the major goals of AI. As Danijar Hafner, a computer scientist at Google DeepMind in San Francisco, California, explained:
“Dreamer marks a significant step towards general AI systems. (…) It allows AI to understand its physical environment and also to self-improve over time, without a human having to tell it exactly what to do.”
AI plays Minecraft on its own
What makes this accomplishment all the more impressive is the fact that no two experiences in Minecraft, a virtual 3D world with a variety of terrains, building resources, and collectibles, are the same, and collecting a diamond requires following a series of complex steps.
For instance, you need to find trees, break them down to gather wood, and use it to build a crafting table. Using more wood, you make a wooden pickaxe, and so on, until you have the right tools to collect a diamond buried underground. As Hafner said:
“You have to really understand what’s in front of you; you can’t just memorize a specific strategy. (…) There’s a long chain of these milestones, and so, it requires very deep exploration.”
How AI learned to play Minecraft
Dreamer has learned to play Minecraft by creating a model of its surroundings and using this ‘world model’ to ‘imagine’ future scenarios and guide decision-making, much like how we think in abstract terms.
This allows the AI agent to try things out and predict the potential rewards of different actions with less computation than needed to actually complete those actions in Minecraft.
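The idea of planning inside a learned model can be sketched in a few lines. The snippet below is a deliberately toy illustration, not Dreamer’s actual architecture (which uses a learned neural world model and an actor-critic trained on imagined rollouts): a hand-written transition table stands in for the world model, and candidate action plans are rolled forward ‘in imagination’ to compare their predicted rewards without touching the real environment.

```python
# Toy stand-in for a learned world model: maps (state, action) to a
# predicted (next_state, reward). In Dreamer this is a neural network
# learned from experience; here it is a hypothetical hand-written table.
WORLD_MODEL = {
    ("start", "chop_tree"):      ("has_wood", 1.0),
    ("start", "dig_down"):       ("start", 0.0),
    ("has_wood", "craft_table"): ("has_table", 1.0),
    ("has_wood", "dig_down"):    ("has_wood", 0.0),
}

def imagined_return(state, plan):
    """Roll a candidate plan forward inside the model (no real
    environment steps) and sum the predicted rewards."""
    total = 0.0
    for action in plan:
        state, reward = WORLD_MODEL.get((state, action), (state, 0.0))
        total += reward
    return total

# Compare candidate plans purely in 'imagination', then act on the best.
plans = [("chop_tree", "craft_table"), ("dig_down", "dig_down")]
best = max(plans, key=lambda p: imagined_return("start", p))
```

Because evaluating a plan is just a few table lookups here (and a fast forward pass in the real system), the agent can screen many futures far more cheaply than actually playing them out in the game.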
In the game, the team deployed a protocol to give Dreamer a ‘plus one’ reward every time it completed one of 12 progressive steps involved in collecting a diamond, such as creating planks and a furnace, mining iron, and forging an iron pickaxe.
They would reset the game every 30 minutes so that Dreamer wouldn’t become habituated to one particular configuration but rather learn general rules for gaining rewards. In total, it takes around nine days of continuous play to train Dreamer to find at least one diamond.
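The reward protocol described above can be sketched as a simple tracker that grants a ‘plus one’ the first time each milestone appears in the agent’s inventory, and that is cleared when the world is reset. This is a minimal sketch under stated assumptions: the milestone names are hypothetical stand-ins following the article’s examples, not the paper’s exact list of 12 steps.

```python
# Hypothetical milestone names inspired by the article's examples
# (planks, furnace, iron, iron pickaxe, diamond); the actual 12-step
# list from the paper is not reproduced here.
MILESTONES = ["log", "plank", "crafting_table", "wooden_pickaxe",
              "furnace", "iron_ore", "iron_ingot", "iron_pickaxe",
              "diamond"]

class MilestoneRewards:
    """Grant a +1 reward the first time each milestone item shows up
    in the agent's inventory during an episode."""

    def __init__(self, milestones):
        self.milestones = milestones
        self.reached = set()

    def reward(self, inventory):
        r = 0.0
        for item in self.milestones:
            if item in inventory and item not in self.reached:
                self.reached.add(item)  # each milestone pays out once
                r += 1.0
        return r

    def reset(self):
        # Called when the world is regenerated (every 30 minutes in the
        # experiment), so the milestones can be earned again in the new
        # world and the agent learns general rules, not one map.
        self.reached.clear()
```

Paying out each milestone only once per episode keeps the agent pushing toward the next step in the chain rather than farming an early reward, while the periodic reset forces it to rediscover the whole sequence in a fresh world.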
Notably, computer scientist Keyon Vafa at Harvard University in Cambridge, Massachusetts, commented on the breakthrough, arguing that:
“This paper is about training a single algorithm to perform well across diverse reinforcement-learning tasks (…) This is a notoriously hard problem and the results are fantastic.”
Meanwhile, AI is becoming so capable in healthcare that it has beaten medical professionals at detecting cancer with nearly 100% precision, with major implications for the diagnosis and treatment of a disease that kills millions of people each year.