Google wants devices to know when you're paying attention

All in the name of more intuitive experiences.


Google has been working on a "new interaction language" for years, and today it's sharing a peek at what it's developed so far. The company is showcasing a set of movements it's defined in its new interaction language in the first episode of a new series called In the lab with Google ATAP. That acronym stands for Advanced Technology and Projects, and it's Google's more-experimental division that the company calls its "hardware invention studio."

The idea behind this "interaction language" is that the machines around us could be more intuitive and perceptive of our desire to interact with them by better understanding our nonverbal cues. "The devices that surround us... should feel like a best friend," senior interaction designer at ATAP Lauren Bedal told Engadget. "They should have social grace."

Specifically (so far, anyway), ATAP is analyzing our movements (as opposed to vocal tones or facial expressions) to see if we're ready to engage, so devices know when to remain in the background instead of bombarding us with information. The team used the company's Soli radar sensor to detect the proximity, direction and pathways of people around it. Then, it parsed that data to determine if someone is glancing at, passing, approaching or turning towards the sensor.

Google formalized this set of four movements, calling them Approach, Glance, Turn and Pass. These actions can be used as triggers for commands or reactions on things like smart displays or other types of ambient computers. If this sounds familiar, it's because some of these gestures already work on existing Soli-enabled devices. The Pixel 4, for example, had a feature called Motion Sense that would snooze alarms when you waved at it, or wake the phone if it detected your hand coming towards it. Google's Nest Hub Max uses its camera to see when you've raised your open palm, and will pause your media playback in response.

Approach feels similar to existing implementations. It allows devices to tell when you (or a body part) are getting closer, so they can bring up information you might be near enough to see. Like the Pixel 4, the Nest Hub uses a similar approach when it knows you're close by, pulling up your upcoming appointments or reminders. It'll also show touch commands on a countdown screen if you're near, and switch to a larger, easy-to-read font when you're farther away.

While Glance may seem like it overlaps with Approach, Bedal explained that it can be for understanding where a person's attention is when they're using multiple devices. "Say you're on a phone call with someone and you happen to glance at another device in the house," she said. "Since we know you may have your attention on another device, we can offer a suggestion to maybe transfer your conversation to a video call." Glance can also be used to quickly display a snippet of information.

An animation showing how Google's proposed interaction language works. This is an example of the Glance action, where a man looks at a display to his right, and its screen shows a black square reacting in response.

What's less familiar are Turn and Pass. "With turning towards and away, we can allow devices to help automate repetitive or mundane tasks," Bedal said. It can be used to determine when you're ready for the next step in a multi-stage process, like following an onscreen recipe, or something repetitive, like starting and stopping a video. Pass, meanwhile, tells the device you're not ready to engage.

It's clear that Approach, Pass, Turn and Glance build on what Google has implemented piecemeal in its products over the years. But the ATAP team also played with combining some of these actions, like passing and glancing or approaching and glancing, which is something we've yet to see much of in the real world.

For all this to work well, Google's sensors and algorithms need to be incredibly adept not only at recognizing when you're making a specific action, but also when you're not. Inaccurate gesture recognition can turn an experience that's meant to be helpful into one that's incredibly frustrating.

ATAP's head of design Leonardo Giusti said, "That's the biggest challenge we have with these signals." He said that devices that are plugged in have more power available to run more complex algorithms than a mobile device does. Part of the effort to make the system more accurate is collecting more data to train machine learning algorithms on, including the correct actions as well as similar but incorrect ones (so they also learn what not to accept).

An animation showing one of Google's movements in its new interaction language.

"The other approach to mitigate this risk is through UX design," Giusti said. He explained that the system can offer a suggestion rather than trigger a completely automated response, to allow users to confirm the right input rather than act on a potentially inaccurate gesture.

Still, it's not like we're going to be frustrated by Google devices misinterpreting these four movements of ours in the immediate future. Bedal pointed out, "What we're working on is purely research. We're not focusing on product integration." And to be clear, Google is sharing this look at the interaction language as part of a video series it's publishing. Later episodes of In the lab with ATAP will cover other topics beyond this new language, and Giusti said it's meant to "give people an inside look into some of the research that we are exploring."

But it's easy to see how this new language can eventually find its way into the many things Google makes. The company's been talking about its vision for a world of "ambient computing" for years, where it envisions various sensors and devices embedded into the many surfaces around us, ready to anticipate and respond to our every need. For a world like that to not feel intrusive or invasive, there are many issues to sort out (protecting user privacy chief among them). Having machines that know when to stay away and when to help is part of that challenge.

Bedal, who's also a professional choreographer, said, "We believe that these movements are really hinting to a future way of interacting with computers that feels invisible by leveraging the natural ways that we move."

She added, "By doing so, we can do less and computers can... operate in the background, only helping us in the right moments."