NVIDIA has been steadily advancing its AI assistant technology in recent months, and now it's clear just how all the pieces fit together. The company has introduced the Omniverse Avatar (3D assistant creation) and Riva (custom AI voice creation) platforms that, combined, produce surprisingly realistic virtual personas with relatively little effort, or, in one case, a deliberately unrealistic one.
In one demo, used to highlight NVIDIA's AI-powered Maxine toolkit, the company created an Omniverse Avatar from a woman's photo and used Riva to train the voice based on that woman, convert text to speech and translate to different languages. The digital stand-in looks and sounds much like the real person (aside from a couple of stiff-sounding translations), and can even turn its head while maintaining natural-looking eye contact. As you might imagine, this could lead to more relatable virtual helpers at kiosks and websites.
Another demo, for NVIDIA's Project Tokkio "talking kiosk" reference app, shows what could happen when you create a wholly artificial character. The tech showcase centers on a 3D, ray-traced toy version of CEO Jensen Huang (complete with his signature outfit and a Riva-trained voice) using AI to hold a conversation with real people on subjects like climate change and the role of proteins in the body. Various Omniverse systems animate his face and hands. It's not meant to be highly authentic, of course, but it shows how you can craft a 3D virtual assistant considerably more engaging than a disembodied voice.
Most of the Maxine development kit is already available. Riva is usable now in an open beta, and will be free for "small-scale" work. Larger rollouts will depend on a Riva Enterprise program launching early in 2022. You'll have to wait longer for Omniverse Avatar, though. While the basic Omniverse platform is in open beta now, Avatar is only "under development" with no specified launch date. Still, this points to a future where an airport or your favorite restaurant can provide an assistant that's (hopefully) useful without seeming too robotic.