At Google I/O 2025, Google unveils the Veo 3 video generation model and the Imagen 4 image generation model

Google presented the next generation of its image and video artificial intelligence (AI) models on Tuesday at I/O 2025. The new models, Imagen 4 and Veo 3, bring new features and improvements over their predecessors. Imagen 4 offers faster generation and better text rendering, while Veo 3 adds native audio generation and can include background sound and dialogue in the videos it creates. Along with the new models, the tech giant launched Flow, a new AI-powered filmmaking tool.

What's New with Imagen 4 and Veo 3?
In a blog post, the Mountain View-based tech giant detailed its new image and video generation models. Imagen 4 arrives over a year after its predecessor launched. Google also released Veo 2 and added new features to Imagen 3 in December 2024.

Imagen 4 focuses on generation speed and accuracy. Like the previous generation, it accepts both text and images as input. The generated images show improved fine detail in elements such as delicate fabrics, water droplets, and animal fur. It can also generate images considerably faster than its predecessor.

Google claims that Imagen 4 produces better images in both photorealistic and abstract styles. It supports a variety of aspect ratios and resolutions up to 2K. The company also improved text rendering, with a focus on spelling and typography. The model is now more context-aware when positioning text, choosing font sizes, and selecting creative font styles.

Imagen 4 is now available in the Gemini app, Whisk, Vertex AI (for enterprises), and Workspace applications including Docs, Slides, Vids, and more. It is unclear whether Google intends to make the model available to all Gemini users or only to paying subscribers. Later this year, the company hopes to release a version of the model that can generate images 10 times faster than Imagen 3.

Google's latest video generation model, Veo 3, now includes native audio generation, allowing it to add ambient sounds, background noise, and dialogue to videos. In a demo shown at I/O 2025, two animated characters spoke to each other in clear, natural-sounding voices.

Veo 3 also improves prompt adherence, real-world physics, and realistic lip synchronization. It is currently available to Google AI Ultra subscribers in the United States through the Gemini app and a newly launched app called Flow. Enterprises can access it through the Vertex AI platform.

Flow is an AI-powered filmmaking tool that uses the Gemini, Imagen, and Veo models. Users describe a video clip in a natural language prompt, and the tool generates an eight-second video. Flow is said to offer strong prompt adherence and to keep cast, locations, objects, and styles consistent across frames. It is available to Google AI Pro and Ultra subscribers in the United States.
