The system is powered by two neural networks. The first takes a camera image and estimates each object's position relative to the robot, yet it was trained only on simulated images: it learned how to interact with the real world before it ever actually saw the real world. The second imitates tasks shown by a human demonstrator, scanning through the recorded demonstration and attending to the frames that tell it what to do next.
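The second network's frame-attention idea can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes dot-product attention over precomputed frame embeddings, and the function name `attend_to_demo` and the toy embeddings are purely illustrative.

```python
import numpy as np

def attend_to_demo(current_obs, demo_frames):
    """Soft attention over recorded demonstration frames.

    current_obs: (d,) embedding of the robot's current camera view
    demo_frames: (T, d) embeddings of the demonstration, one per frame
    Returns the attention weights (T,) and the attended context (d,),
    a weighted mix of the frames that best match the current situation.
    """
    scores = demo_frames @ current_obs        # similarity to each frame
    scores = scores - scores.max()            # for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    context = weights @ demo_frames           # "what to do next" summary
    return weights, context

# Toy example: three demo frames with orthogonal embeddings; the current
# view is closest to frame 2, so attention concentrates there.
demo = np.eye(3)
obs = np.array([0.1, 0.2, 0.9])
weights, context = attend_to_demo(obs, demo)
```

In a real system the embeddings would come from the vision network rather than being hand-built, but the mechanism is the same: the robot weighs each recorded frame by how relevant it is to what it currently sees.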
This training model is only a prototype, but teaching robots entirely in simulation could let researchers train them for complex tasks without needing any physical setup at all. That would allow humans to safely and easily simulate extreme environments such as Arctic waters, areas soaked in nuclear radiation, or even other planets.