Google unveils Veo and Imagen 3, its latest AI media creation models

It's also targeting recording artists with Music AI Sandbox.

It's all AI all the time at Google I/O! Today, Google announced its new AI media creation engines: Veo, which can produce "high-quality" 1080p videos; and Imagen 3, its latest text-to-image framework. Neither sound particularly revolutionary, but they're a way for Google to keep up the fight against OpenAI's Sora video model and Dall-E 3, a tool that has practically become synonymous with AI-generated images.

Google claims Veo has "an advanced understanding of natural language and visual semantics" to create whatever video you have in mind. The AI generated videos can last "beyond a minute." Veo is also capable of understanding cinematic and visual techniques, like the concept of a timelapse. But really, that should be table stakes for an AI video generation model, right?

To prove that Veo isn't out to steal artist's jobs, Google has also partnered with Donald Glover and Gilga, his creative studio, to show off the model's capabilities. In a very brief promotional video, we see Glover and crew using text to create video of a convertible arriving at a European home, and a sailboat gliding through the ocean. According to Google, Veo can simulate real-world physics better than its previous models, and it's also improved how it renders high-definition footage.

"Everybody's going to become a director, and everybody should be a director," Glover says in the video, absolutely earning his Google paycheck. "At the heart of all of this is just storytelling. The closer we are to be able to tell each other our stories, the more we'll understand each other."

It remains to be seen if anyone will actually want to watch AI generated video, outside of the morbid curiosity of seeing a machine attempt to algorithmically recreate the work of human artists. But that's not stopping Google or OpenAI from promoting these tools and hoping they'll be useful (or at least, make a bunch of money). Veo will be available inside of Google's VideoFX tool today for some creators, and the company says it'll also be coming to YouTube Shorts and other products. If Veo does end up becoming a built-in part of YouTube Shorts, that's at least one feature Google can lord over TikTok.

Google IO 2024

As for Imagen 3, Google is making the usual promises: It's said to be the company's "highest quality" text-to-image model, with "incredible level of detail" for "photorealistic, lifelike images" and fewer artifacts. The real test, of course, will be to see how it handles prompts compared to Dall-E 3. Imagen 3 handles text better than before, Google says, and it's also smarter about handling details from long prompts.

Google is also working with recording artists like Wyclef Jean and Bjorn to test out its Music AI Sandbox, a set of tools that can help with song and beat creation. We only saw a brief glimpse of this, but it's led to a few intriguing demos:

The sun rises and sets. We're all slowly dying. And AI is getting smarter by the day. That seems to be the big takeaway from Google's latest media creation tools. Of course they're getting better! Google is pouring billions into making the dream of AI a reality, all in a bid to own the next great leap for computing. Will any of this actually make our lives better? Will they ever be able to generate art with genuine soul? Check back at Google I/O every year until AGI actually appears, or our civilization collapses.

Catch up on all the news from Google I/O 2024 right here!