8 · Google DeepMind Blog · 1mo ago

Google DeepMind Introduces Veo 3, Imagen 4, and Flow Filmmaking Tool

Google DeepMind has announced Veo 3 and Imagen 4, new generative video and image models respectively, alongside a filmmaking tool called Flow. The announcement comes from DeepMind's official blog and represents the next generation of its generative media capabilities. The releases expand Google's multimodal generative AI portfolio, targeting creative and professional media production.

Frontier Model Releases · Agent and Tool Ecosystem · Multimodal Progress · Imagen 4 · Veo 3.1 · Flow · Google DeepMind

Related guides (4)

Google DeepMind

Google DeepMind: Frontier AI Across Models, Robotics, and Scientific Discovery

Frontier Model Releases · Topic guide

Frontier Model Releases: The Race From Language to Action

Multimodal Progress · Topic guide

Multimodal Progress: How AI Learned to See, Hear, and Act

Agent and Tool Ecosystem · Topic guide

Agent and Tool Ecosystem: How the Infrastructure Layer Around LLMs Is Consolidating

Related events (8)

6 · Google DeepMind Blog · 1mo ago

Introducing Veo 3.1 and Advanced Creative Capabilities

Google DeepMind has announced Veo 3.1, an updated version of its video generation model, with significant enhancements to creative control features. The announcement comes from DeepMind's official blog, indicating a formal product update rather than a research preview. Specific capability details are not provided in the body text, but the framing suggests improvements to user-facing generation controls.

Frontier Model Releases · Multimodal Progress · Veo · Veo 3.1 · Google DeepMind

6 · Google DeepMind Blog · 1mo ago

Veo 3.1 Ingredients to Video: More consistency, creativity and control

Google DeepMind has released Veo 3.1, an updated video generation model that improves consistency, creativity, and control in generated clips. The update produces more natural and dynamic video content and adds support for vertical video generation. The announcement comes from DeepMind's official blog, a tier-1 source.

Frontier Model Releases · Multimodal Progress · Veo · Veo 3.1 · Google DeepMind

7 · Google DeepMind Blog · 1mo ago

Veo 2 Video Generation Launches in Gemini Advanced and Whisk Animate

Google DeepMind is rolling out Veo 2 video generation capabilities to Gemini Advanced and Whisk, enabling users to create high-resolution eight-second videos from text prompts or animate still images. Gemini Advanced subscribers can generate videos directly from text, while Whisk Animate converts input images into short animated clips. This marks a consumer-facing deployment of Veo 2, DeepMind's second-generation video generation model.

Frontier Model Releases · Enterprise Deployment Patterns · Gemini Advanced · Veo 2 · Whisk · +3 more

5 · Google DeepMind Blog · 1mo ago

Behind "ANCESTRA": combining Veo with live-action filmmaking

Google DeepMind partnered with filmmaker Darren Aronofsky and director Eliza McNitt, along with a crew of over 200 people, to produce a film called ANCESTRA that integrates Veo video generation with live-action filmmaking. The project represents a high-profile creative application of DeepMind's Veo video model in professional cinematic production. This serves as a capability demonstration of Veo in a real-world, large-scale filmmaking context.

Enterprise Deployment Patterns · Multimodal Progress · ANCESTRA · Veo · Darren Aronofsky · +2 more

7 · The Batch · 18d ago

Grok Imagine 1.0 Sharply Cuts Costs for High-Quality Video Generation

xAI launched Grok Imagine 1.0, a text-and-image-to-video model that topped the Artificial Analysis Video Arena leaderboard in both the text-to-video and image-to-video categories at launch. The model generates clips of up to 15 seconds with audio at $4.20 per minute of output, significantly undercutting Google Veo 3.1 ($12/min) and OpenAI Sora 2 Pro ($30/min). It is integrated with the X social network, enabling direct generation and sharing, though xAI disclosed no technical details about the model's architecture. The launch highlights continued rapid cost compression in video generation, with a roughly seven-fold price gap between Grok Imagine 1.0 and Sora 2 Pro ($4.20 vs. $30 per minute).
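
The per-minute prices quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the three prices come from the summary, while the function names and the assumption that cost scales linearly with clip length are hypothetical and not confirmed by any of the vendors.

```python
# Per-minute output prices in USD, as quoted in the summary above.
PRICE_PER_MINUTE = {
    "Grok Imagine 1.0": 4.20,
    "Google Veo 3.1": 12.00,
    "OpenAI Sora 2 Pro": 30.00,
}


def price_ratio(model: str, baseline: str = "Grok Imagine 1.0") -> float:
    """Return how many times more expensive `model` is than `baseline` per minute."""
    return PRICE_PER_MINUTE[model] / PRICE_PER_MINUTE[baseline]


def clip_cost(model: str, seconds: float) -> float:
    """Return the cost in USD of a clip of the given length.

    Assumes billing is linear in clip duration (an assumption, not a
    documented pricing rule).
    """
    return PRICE_PER_MINUTE[model] * seconds / 60.0


if __name__ == "__main__":
    for name in PRICE_PER_MINUTE:
        print(
            f"{name}: {price_ratio(name):.2f}x baseline, "
            f"15 s clip ~ ${clip_cost(name, 15):.2f}"
        )
```

Under these assumptions, Sora 2 Pro comes out at about 7.1 times the Grok Imagine 1.0 price and Veo 3.1 at about 2.9 times, which matches the "seven-fold" framing in the summary.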

Frontier Model Releases · Evaluation and Benchmarking · Artificial Analysis · Grok Imagine · Google Veo 3.1 · +10 more

8 · OpenAI Blog · 1mo ago

Introducing 4o Image Generation

OpenAI has integrated native image generation directly into GPT-4o, positioning it as a primary model capability rather than a separate system. The announcement frames this as OpenAI's most advanced image generator to date, emphasizing both aesthetic quality and practical utility. This represents a shift toward unified multimodal models that generate images natively rather than relying on separate diffusion-based pipelines.

Frontier Model Releases · Inference Economics · GPT-4o · GPT-4o Image Generation · OpenAI · +1 more

8 · Google DeepMind Blog · 1mo ago

Gemma 4: Google DeepMind Releases Most Capable Open Models

Google DeepMind has released Gemma 4, described as its most capable open models to date. The models are purpose-built for advanced reasoning and agentic workflows, and DeepMind positions them as the most capable open models byte for byte. The announcement comes from DeepMind's official blog, indicating a significant open-weights release targeting the frontier open model space.

Frontier Model Releases · Open Weights Progress · Google DeepMind · Gemma 4 · +1 more

7 · Google DeepMind Blog · 11d ago

Google DeepMind releases Gemma 4 12B, a unified encoder-free multimodal open model

Google DeepMind has released Gemma 4 12B, a new open-weights multimodal model that uses a unified, encoder-free architecture. The model is positioned as a capable multimodal system at the 12B parameter scale. The release is notable as an open-weights model from a frontier lab with an architectural distinction: it eliminates the separate vision encoder common in most multimodal models.

Frontier Model Releases · Open Weights Progress · Google · Google DeepMind · Gemma 4 · +1 more