6 · Google DeepMind Blog · 1mo ago

Improved Gemini Audio Models for Powerful Voice Experiences

Google DeepMind has announced improved Gemini audio models aimed at better voice experiences. The announcement appears on the official DeepMind blog, which indicates a formal update to the Gemini family's audio processing and generation features. The body text gave no specific technical details, but the framing suggests advances in speech understanding, synthesis, or real-time voice interaction. The update is part of Google DeepMind's ongoing work on multimodal Gemini capabilities.

Frontier Model Releases · Multimodal Progress · Gemini Audio · Google DeepMind · Gemini

Related guides (3)

Google DeepMind

Google DeepMind: The Lab Behind Gemini, AlphaFold, and Frontier AI

Frontier Model Releases · Topic guide

Frontier Model Releases: The Race From Language to Action

Multimodal Progress · Topic guide

Multimodal Progress: How AI Learned to See, Hear, and Act

Related events (8)

7 · Google DeepMind Blog · 1mo ago

Advanced audio dialog and generation with Gemini 2.5

Google DeepMind has announced new audio dialog and generation capabilities in Gemini 2.5. The update extends the model's multimodal capabilities into AI-powered audio interaction and synthesis. No further technical details are provided in the announcement body.

Frontier Model Releases · Multimodal Progress · Gemini 2.5 · Google DeepMind

6 · Google DeepMind Blog · 1mo ago

Gemini 3.1 Flash Live: Making audio AI more natural and reliable

DeepMind has released Gemini 3.1 Flash Live, a new voice model designed for real-time audio interactions. The model features improved precision and lower latency compared to its predecessor, aiming to make voice-based AI interactions more fluid and natural. The announcement comes from DeepMind's official blog, indicating a production-grade release.

Frontier Model Releases · Inference Economics · Google DeepMind · Gemini 3.1 Flash Live

6 · Google DeepMind Blog · 1mo ago

Gemini 3.1 Flash TTS: the next generation of expressive AI speech

DeepMind has released Gemini 3.1 Flash TTS, a new audio model focused on expressive speech generation. The model introduces granular audio tags that allow developers precise control over AI speech output. This represents an incremental advancement in Google's text-to-speech capabilities within the Gemini model family.

Frontier Model Releases · Multimodal Progress · Gemini 3.1 Flash TTS · Google DeepMind · Gemini

9 · Google DeepMind Blog · 1mo ago

A new era of intelligence with Gemini 3

DeepMind has published a blog post titled 'A new era of intelligence with Gemini 3,' suggesting a major new model release or announcement in the Gemini series. The body content was not provided, but the title and source indicate this is a flagship model announcement from Google DeepMind. This would represent the next generation of the Gemini model family following Gemini 2.x.

Long Context Evolution · Frontier Model Releases · Google DeepMind · Gemini

6 · Google DeepMind Blog · 11d ago

Google DeepMind launches Gemini 3.5 Live Translate for real-time voice translation

Google DeepMind has released Gemini 3.5 Live Translate, a near real-time speech translation capability powered by Gemini 3.5. The feature is being deployed across Google AI Studio, Google Translate, and Google Meet. This represents a multimodal capability expansion of the Gemini model family into live audio translation at production scale.

Frontier Model Releases · Multimodal Progress · Google AI Studio · Google Meet · Google Translate

7 · Hacker News · 1mo ago

Gemini Omni Model Announced by Google DeepMind

Google DeepMind has published a page for 'Gemini Omni,' a new model in the Gemini family. The announcement appears on DeepMind's official models page, suggesting a new multimodal or omni-capable variant. Limited detail is available from the source, but the HN community engagement (190 points, 87 comments) indicates notable interest.

Frontier Model Releases · Multimodal Progress · Gemini Omni · Google DeepMind · Gemini

8 · Google DeepMind Blog · 1mo ago

Gemini 2.5: Updates to our family of thinking models

Google DeepMind has announced updates to the Gemini 2.5 model family, including Gemini 2.5 Pro reaching stable status, Gemini 2.5 Flash becoming generally available, and a new Gemini 2.5 Flash-Lite entering preview. These releases mark the maturation of DeepMind's 'thinking model' line with enhanced performance and accuracy. The updates span multiple tiers of the Gemini 2.5 family, from the flagship Pro to the lightweight Flash-Lite variant.

Long Context Evolution · Frontier Model Releases · Gemini-2.5-Flash-Lite · Google DeepMind · Gemini-2.5-Pro

6 · Google DeepMind Blog · 1mo ago

Image Editing in Gemini Gets Major Upgrade

Google DeepMind has announced a significant upgrade to native image editing capabilities within the Gemini app. The update enables new ways to transform images directly through the Gemini interface. The blog post is light on technical specifics but signals continued multimodal capability expansion for the Gemini product line.

Frontier Model Releases · Multimodal Progress · Google DeepMind · Gemini App · Gemini