Entity · model

Gemini 3.1 Flash Live

modelactivegemini-3-1-flash-live-516bf781·6 events·first seen May 19, 2026

Aliases: Gemini 3.1 Flash Live, Gemini 3.1 Flash-Lite, Gemini 3.1 Flash Lite, Gemini-3.1-Flash-Lite

Co-occurring entities

More like this (12)

Gemini 3.5 Flash Gemini 3.1 Flash Live Preview Gemini 3 Flash Gemini 3.5 Flash-Lite Gemini 3.6 Flash Gemini-2.5-Flash-Lite Gemini Flash 3.5 Gemini 3.1 Flash Image Gemini 3.1 Flash TTS Gemini 3.1 Pro Gemini 3.5 Flash Cyber Gemini-3.1-Pro

Recent events (6)

7arXiv · cs.CL·Jun 25, 2026·source ↗

Study finds real-time voice AI systems ignore vocal delivery cues despite perceiving them

A new arXiv paper evaluates four production real-time voice AI systems — OpenAI GPT Realtime 2, Google Gemini 3.1 Flash Live, Qwen3.5 Omni Plus, and Qwen3.5 Omni Flash — on tasks where vocal delivery (distress, fear, sarcasm) carries meaningful information distinct from word content. All four systems consistently act on words alone, ending calls with crying users who deny distress, approving frightened-voice wire transfers, and accepting sarcastic consent. Critically, three of four systems can correctly identify the emotional state when asked directly, revealing a gap between perception and decision-making the authors term the 'emotional intelligence gap.' Prompting systems to attend to vocal delivery improves performance only partially and inconsistently.

Evaluation and Benchmarking AI Safety Research Qwen3.5 Omni Flash GPT-Realtime-2 Google +6 more

6The Batch·Jun 3, 2026·source ↗

Qwen3.5 Small tops mobile-sized open models; GPT-5.3 Instant, Gemini 3.1 Flash-Lite, Claude memory import, and LLM deanonymization research

Alibaba released the Qwen3.5 Small model series (0.8B–9B parameters) with a hybrid Gated Delta Networks + sparse MoE architecture, with the 9B model outperforming OpenAI's gpt-oss-120B on GPQA Diamond despite being 13.5x smaller; all weights are Apache 2.0 licensed. Google introduced Gemini 3.1 Flash-Lite, a cost-optimized model at $0.25/M input tokens with 2.5x faster TTFT than Gemini 2.5 Flash. OpenAI released GPT-5.3 Instant targeting conversational quality improvements and hallucination reduction, while Anthropic added memory import/export functionality across all Claude tiers. Separately, researchers from MATS, Anthropic, and ETH Zurich demonstrated that LLM-based pipelines can deanonymize pseudonymous online users at 68% recall/90% precision for $1–4 per profile.

Frontier Model Releases Open Weights Progress Claude Google Alibaba +14 more

6arXiv · cs.AI·May 25, 2026·source ↗

ETCHR: Decoupled Image Editing for Visual Chain-of-Thought Reasoning in MLLMs

ETCHR introduces a question-conditioned, reasoning-aware image editing model that decouples visual transformation from downstream understanding in multimodal LLMs. It addresses two identified gaps—language-side (mapping abstract questions to visual edits) and generation-side (edit quality degrading with reasoning depth)—via a two-stage training recipe combining supervised fine-tuning on edit trajectories and VLM-derived reward signals. Because the editor is decoupled, it plugs into arbitrary MLLMs without retraining, yielding Pass@1 gains of roughly +4.6 to +5.5 points across five task families when paired with Qwen3-VL-8B, Gemini-3.1-Flash-Lite, and Kimi K2.5. The work advances the 'think with images' paradigm beyond fixed toolkits and unified multimodal approaches.

Agent and Tool Ecosystem Alignment and RLHF Reasoning Enhancement Qwen3-4B ETCHR +5 more

7arXiv · cs.CL·May 22, 2026·source ↗

Boiling the Frog: A Multi-Turn Benchmark for Agentic Safety

Researchers introduce 'Boiling the Frog,' a multi-turn safety benchmark evaluating whether tool-using AI agents in corporate/office settings are susceptible to incremental attacks that begin with benign requests before introducing harmful payloads. The benchmark uses stateful multi-turn evaluation with a three-level operational risk taxonomy grounded in the EU AI Act and its GPAI Code of Practice. Across nine models, aggregate strict attack success rate is 44.4%, ranging from 20.5% for Claude Haiku 4.5 to 92.9% for Gemini 3.1 Flash Lite, with loss-of-control scenarios reaching 93.3% category-level ASR.

Evaluation and Benchmarking AI Safety Research Seed 2.0 Lite Claude Haiku 4.5 EU AI Act +7 more

6Google Deepmind Blog·May 19, 2026·source ↗

Gemini 3.1 Flash-Lite: Built for intelligence at scale

Google DeepMind has released Gemini 3.1 Flash-Lite, described as the fastest and most cost-efficient model in the Gemini 3 series. The announcement positions it as optimized for high-throughput, cost-sensitive deployments at scale. The body is sparse, offering no benchmark details or capability specifics beyond the efficiency framing.

Frontier Model Releases Inference Economics Google DeepMind Gemini 3.1 Flash Live Gemini +1 more

6Google Deepmind Blog·May 19, 2026·source ↗

Gemini 3.1 Flash Live: Making audio AI more natural and reliable

DeepMind has released Gemini 3.1 Flash Live, a new voice model designed for real-time audio interactions. The model features improved precision and lower latency compared to its predecessor, aiming to make voice-based AI interactions more fluid and natural. The announcement comes from DeepMind's official blog, indicating a production-grade release.

Frontier Model Releases Inference Economics Google DeepMind Gemini 3.1 Flash Live +1 more