Essentially, Deep Mind has developed a framework that'll keep AI from learning how to prevent -- or induce -- human interruption of whatever it's doing. The team responsible for toppling a word Go champion hypothesized a situation where a robot was working in a warehouse, sorting boxes or going outside to bring more boxes in.
The latter is considered more important, so the researchers would give the robot a bigger reward for doing so. But human intervention to prevent damage is needed because it rains pretty frequently here. That alters the task for the robot, making it want to stay out of the rain, and then adopting the human interruption as part of the task rather than being a one-off thing.
"Safe interruptibility can be useful to take control of a robot that is misbehaving and may lead to irreversible consequences, or to take it out of a delicate situation, or even to temporarily use it to achieve a task it did not learn to perform or would not necessarily receive rewards for this," the researchers write.
Deep Mind isn't sure that its interruption mechanisms could be applicable to all algorithms. Specifically? Those related to policy-search robotics (a part of machine learning), so it sounds like there's still a ways to go before the kill switch can be implemented across the board. Sleep tight.