NVIDIA's new AI turns videos of the real world into virtual landscapes

It all started with cities and -- what else? -- Gangnam Style.

Senior Editor, Mobile

Updated Mon, Dec 3, 2018, 8:00 AM

Attendees of this year's NeurIPS AI conference in Montreal can spend a few moments driving through a virtual city, courtesy of NVIDIA. While that normally wouldn't be much to get worked up over, the simulation is fascinating because of what made it possible. With the help of some clever machine learning techniques and a handy supercomputer, NVIDIA has cooked up a way for AI to chew on existing videos and use the objects and scenery found within them to build interactive environments.

NVIDIA's research here isn't just a significant technical achievement; it also stands to make it easier for artists and developers to craft lifelike virtual worlds. Instead of having to meticulously design objects and people to fill a space polygon by polygon, they can use existing machine learning tools to roughly define those entities and let NVIDIA's neural network fill in the rest.

"Neural networks, specifically generative models, will change how graphics are created," Bryan Catanzaro, NVIDIA's vice president of applied deep learning, said in a statement. "This will enable developers, particularly in gaming and automotive, to create scenes at a fraction of the traditional cost."

Here's how it works. Catanzaro told reporters that researchers spent about a week training the fledgling neural model on one of the company's DGX-1 supercomputers, using dashcam videos taken from self-driving car trials in cities. (NVIDIA CEO Jensen Huang once called the DGX-1 the equivalent of "250 servers in a box," so pulling off a similar feat at home seems all but impossible.)

Meanwhile, the research team used Unreal Engine 4 to create what they called a "semantic map" of a scene, which essentially assigns every pixel on-screen a label. Some pixels got lumped into the "car" bucket, others into the "trees" category, or "buildings"; you get it. Those clumps of pixels were also given clearly defined edges, so Unreal Engine ultimately produced a sort of "sketch" of a scene that got fed to NVIDIA's neural model. From there, the AI applied the visuals for what it knew a "car" looked like to the clump of pixels labeled "car" and repeated the same process for every other classified object in the scene. That might sound tedious, but the whole thing happened faster than you might think: Catanzaro said the car simulation ran at 25 frames per second and that the AI rendered everything in real time.
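To make the "semantic map" idea a little more concrete, here is a minimal sketch in plain Python. It is not NVIDIA's code: the class labels, the tiny 4x6 grid and the helper names `edge_map` and `pixels_of` are all assumptions made up for illustration. It only shows what "every pixel gets a label" and "clumps get defined edges" might look like as data.

```python
# Hypothetical class labels; the article does not list the real label set.
SKY, BUILDING, CAR, ROAD = 0, 1, 2, 3

# A tiny 4x6 "semantic map": every pixel holds a class label, not a color.
semantic_map = [
    [SKY,      SKY,      SKY, SKY, SKY,      SKY],
    [BUILDING, BUILDING, SKY, SKY, BUILDING, BUILDING],
    [BUILDING, BUILDING, CAR, CAR, BUILDING, BUILDING],
    [ROAD,     ROAD,     ROAD, ROAD, ROAD,   ROAD],
]


def edge_map(labels):
    """Mark a pixel with 1 if its label differs from the pixel to its right or below."""
    height, width = len(labels), len(labels[0])
    edges = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            right_differs = x + 1 < width and labels[y][x] != labels[y][x + 1]
            below_differs = y + 1 < height and labels[y][x] != labels[y + 1][x]
            if right_differs or below_differs:
                edges[y][x] = 1
    return edges


def pixels_of(labels, target):
    """Return the (row, column) positions of every pixel carrying the target label."""
    return [
        (y, x)
        for y, row in enumerate(labels)
        for x, value in enumerate(row)
        if value == target
    ]


if __name__ == "__main__":
    for row in edge_map(semantic_map):
        print(row)
    print("car pixels:", pixels_of(semantic_map, CAR))
```

In a pipeline like the one described, a label grid and its edge sketch would be the input, and the trained network would be the part that paints realistic pixels over each labeled region. That painting step is the hard part and is not attempted here.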

NVIDIA's team also used this new video-to-video synthesis technique to digitally coax a team member into dancing like PSY. Crafting this model took the same kind of work as the car simulation, only this time the AI was tasked with figuring out the dancer's poses, turning them into rudimentary stick figures and rendering another person's appearance on top of them.
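The stick-figure step can be pictured with a similarly small sketch. Again, this is an illustration under stated assumptions rather than the researchers' method: the joint names, coordinates and the `SKELETON` connection list are hypothetical, and the pose estimation that would actually produce the joint positions from video is skipped entirely.

```python
# Hypothetical 2D joint positions (x, y) in pixels for a single video frame.
keypoints = {
    "head": (50, 10),
    "neck": (50, 25),
    "hip": (50, 60),
    "left_hand": (30, 40),
    "right_hand": (70, 40),
    "left_foot": (40, 95),
    "right_foot": (60, 95),
}

# Assumed list of which joints are joined by a "stick".
SKELETON = [
    ("head", "neck"),
    ("neck", "hip"),
    ("neck", "left_hand"),
    ("neck", "right_hand"),
    ("hip", "left_foot"),
    ("hip", "right_foot"),
]


def stick_figure(points, skeleton=SKELETON):
    """Turn joint positions into line segments, skipping sticks with a missing joint."""
    return [
        (points[a], points[b])
        for a, b in skeleton
        if a in points and b in points
    ]


if __name__ == "__main__":
    for start, end in stick_figure(keypoints):
        print(start, "->", end)
```

A sequence of such segment lists, one per frame, would play the same role for the dance demo that the label map plays for the driving demo: a rough sketch that the network then dresses in someone else's appearance.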

For now, the company's results speak for themselves. They're not as graphically rich or as detailed as a typical scene rendered in a AAA video game, but NVIDIA's sample videos offer glimpses at digital cities filled with objects that do sort of look real. Emphasis on "sort of." The tongue-in-cheek Gangnam Style body swaps worked a little better.

While NVIDIA has open-sourced all of its underlying code, it'll likely be a while yet before developers start using these tools to flesh out their next VR set pieces. That's just as well, honestly, since the company was quick to point out the neural network's limitations: while the virtual cars zipping around its simulated cityscape looked surprisingly true-to-life, NVIDIA said its model wasn't great at rendering vehicles as they turn because its label maps lacked sufficient information. More troubling to VR artisans is the fact that certain objects, like those pesky cars, might not always look the same as the scene progresses; in particular, NVIDIA said those objects may change color slightly over time. These are all clearly representations of real-world objects, but they're a long way from staying photorealistic for long periods of time.

Those technical shortcomings are one thing; it's sadly not hard to see how these techniques could be used for unsavory purposes, too. Just look at deepfakes: it's getting remarkably difficult to tell these artificially generated videos apart from the real thing, and as NVIDIA proved with its Gangnam Style test, its neural model could be used to create uncomfortable situations for real people. Perhaps unsurprisingly, Catanzaro largely looks on the bright side.

"People really enjoy virtual experiences," he told Engadget. "Most of the time they're used for good things. We're focused on the good applications." Later, though, he conceded people using tools like these for things he doesn't approve of is the "nature of technology" and pointed out that "Stalin was Photoshopping people out of pictures in the '50s before Photoshop even existed."

There's no denying that NVIDIA's research is a notable step forward in digital imaging, and in time it may help change the way we create and interact with virtual worlds. For commerce, for art, for innovation and more, that's a good thing. Even so, the existence of these tools also means the line between real events and fabricated ones will continue to grow more tenuous, and pretty soon we'll have to start really reckoning with what these tools are capable of.
