Google's new Gemini 1.5 Flash AI model is lighter than Gemini Pro and more accessible

The new model will cut costs for developers using Google's models to build AI applications.


Google announced updates to its Gemini family of AI models at I/O, the company’s annual conference for developers, on Tuesday. It’s rolling out a new model called Gemini 1.5 Flash, which it says is optimized for speed and efficiency.

“[Gemini] 1.5 Flash excels at summarization, chat applications, image and video captioning, data extraction from long documents and tables, and more,” wrote Demis Hassabis, CEO of Google DeepMind, in a blog post. Hassabis added that Google created Gemini 1.5 Flash because developers needed a model that was lighter and less expensive than the Pro version, which Google announced in February. Gemini 1.5 Pro is more efficient and powerful than the company’s original Gemini model announced late last year.

Gemini 1.5 Flash sits between Gemini 1.5 Pro and Gemini Nano, Google’s smallest model, which runs locally on devices. Despite being lighter than Gemini 1.5 Pro, it is just as powerful, according to Google. The company said that this was achieved through a process called “distillation”, in which the most essential knowledge and skills from Gemini 1.5 Pro were transferred to the smaller model. This means that Gemini 1.5 Flash will get the same multimodal capabilities as Pro, as well as its long context window – the amount of data that an AI model can ingest at once – of one million tokens. This, according to Google, means that Gemini 1.5 Flash will be capable of analyzing a 1,500-page document or a codebase with more than 30,000 lines at once.
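For readers curious about what distillation looks like in practice, here is a minimal sketch of the classic soft-label version of the technique, in which a small “student” model is trained to match the softened output distribution of a larger “teacher” model. Google has not published the details of how Gemini 1.5 Flash was trained, so the function names, the temperature value and the example scores below are illustrative assumptions, not Google’s method.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's.

    Training the student to minimize this pushes its predictions toward the
    teacher's. The temperature of 2.0 is an arbitrary illustrative choice.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical scores over three candidate next tokens.
teacher = [4.0, 1.0, 0.5]
close_student = [3.8, 1.1, 0.4]  # roughly agrees with the teacher
far_student = [0.5, 1.0, 4.0]    # disagrees with the teacher

print(distillation_loss(teacher, close_student))
print(distillation_loss(teacher, far_student))
```

The student that agrees with the teacher gets the smaller loss, which is the signal a training loop would use to nudge the smaller model toward the larger one’s behavior.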

Gemini 1.5 Flash, like the other models in the family, isn’t really meant for consumers. Instead, it’s a faster and less expensive way for developers to build their own AI products and services using tech designed by Google.

In addition to launching Gemini 1.5 Flash, Google is also upgrading Gemini 1.5 Pro. The company said that it had “enhanced” the model’s abilities to write code, reason and parse audio and images. But the biggest update is yet to come – Google announced it will double the model’s existing context window to two million tokens later this year. That would make it capable of processing two hours of video, 22 hours of audio, more than 60,000 lines of code or more than 1.4 million words at the same time.
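To put those figures in perspective, the sketch below turns Google’s quoted numbers into a rough check of whether a given workload would fit in a two-million-token window. The conversion rates are simply derived from the figures above (about 1.4 million words or 60,000 lines of code per two million tokens). Treating them as uniform rates that can be added together is an assumption made for illustration, and real token counts depend on the tokenizer and the content.

```python
# Figures quoted by Google for the planned two-million-token window.
WINDOW_TOKENS = 2_000_000
WORDS_PER_WINDOW = 1_400_000
CODE_LINES_PER_WINDOW = 60_000


def estimated_tokens(words=0, code_lines=0):
    """Rough token estimate, assuming uniform rates derived from the quoted figures."""
    tokens_per_word = WINDOW_TOKENS / WORDS_PER_WINDOW
    tokens_per_line = WINDOW_TOKENS / CODE_LINES_PER_WINDOW
    return words * tokens_per_word + code_lines * tokens_per_line


def fits_in_window(words=0, code_lines=0, window=WINDOW_TOKENS):
    """Return True if the estimated token count is within the window."""
    return estimated_tokens(words, code_lines) <= window


# Hypothetical workloads.
print(fits_in_window(words=700_000))                      # a very long document
print(fits_in_window(words=900_000, code_lines=30_000))   # documents plus a codebase
```

Because Google describes the quoted capacities as “more than” the stated amounts, an estimate like this errs on the conservative side.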

Both Gemini 1.5 Flash and Pro are now available in public preview in Google’s AI Studio and Vertex AI. The company also announced a new version of its Gemma open model, called Gemma 2, today. But these updates are aimed at developers and people who like to tinker with building AI apps and services, not at the average consumer.
