5 · OpenAI Release Notes · 2d ago

OpenAI releases gpt-realtime-1.5 and gpt-audio-1.5 to production APIs

OpenAI has released gpt-realtime-1.5 to the Realtime API and gpt-audio-1.5 to the Chat Completions API. These are incremental model updates to OpenAI's audio and real-time speech capabilities. The release expands developer access to updated audio-capable models through existing API surfaces.

Frontier Model Releases Multimodal Progress OpenAI Chat Completions API GPT-Realtime-2 gpt-audio-1.5 OpenAI Realtime API OpenAI

Related guides (3)

Frontier Model Releases · Topic guide

Frontier Model Releases: The Race to Build the World's Most Capable AI

OpenAI

OpenAI: The Lab That Made AI a Household Name

Multimodal Progress · Topic guide

Multimodal Progress: How AI Learned to See, Hear, and Act

Related events (8)

7 · OpenAI Blog · 1mo ago

Introducing gpt-realtime and Realtime API updates

OpenAI is releasing a new speech-to-speech model called gpt-realtime alongside expanded Realtime API capabilities. New features include MCP server support, image input, and SIP phone calling support. These updates extend the Realtime API's utility for voice-driven and multimodal agent applications.

Frontier Model Releases Inference Economics GPT-Realtime-2 SIP Realtime API +4 more

6 · The Batch · 1mo ago

OpenAI Updates Audio Models That Reason, Transcribe, and Translate

OpenAI introduced three new audio models in its Realtime API: GPT-Realtime-2 (speech-to-speech with five configurable reasoning effort levels), GPT-Realtime-Translate (70+ input languages), and GPT-Realtime-Whisper (transcription). GPT-Realtime-2 operates as an end-to-end audio model including reasoning, with latency ranging from 1.12 seconds at minimal effort to 2.33 seconds at high effort. Benchmark results are mixed: it leads Scale AI's Audio MultiChallenge and Artificial Analysis Conversational Dynamics but trails Step-Audio R1.1 Realtime and Grok Voice Think Fast 1.0 on speech reasoning and agentic tasks. The configurable reasoning-latency tradeoff is positioned as a key differentiator for voice agent applications.
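The reasoning-latency tradeoff described above can be illustrated with a small sketch that picks an effort level from a latency budget. It uses only the two latencies reported in this summary (1.12 s at minimal effort, 2.33 s at high effort); the other three levels are left out because their names and latencies are not given here. `pick_effort` is a hypothetical helper, not part of any OpenAI SDK.

```python
from typing import Optional

# Latencies reported in the summary above, in seconds.
# Only two of the five effort levels are named there, so only those appear.
REPORTED_LATENCY_S = {"minimal": 1.12, "high": 2.33}


def pick_effort(budget_s: float) -> Optional[str]:
    """Return the slowest reported effort level that fits the latency budget.

    Hypothetical helper: returns None when even the fastest level is too slow.
    """
    fitting = [
        (latency, name)
        for name, latency in REPORTED_LATENCY_S.items()
        if latency <= budget_s
    ]
    if not fitting:
        return None
    # max() compares latency first, so the highest-effort level that fits wins.
    return max(fitting)[1]
```

A caller with a 1.5-second budget would get "minimal", while a 3-second budget leaves room for "high".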

Frontier Model Releases Evaluation and Benchmarking Scale AI Audio MultiChallenge GPT-Realtime-2 Google +14 more

2 · OpenAI Release Notes · 2d ago

OpenAI updates gpt-realtime-mini and gpt-audio-mini slugs to 2025-12-15 snapshots

OpenAI has updated the rolling model slugs gpt-realtime-mini and gpt-audio-mini to point to the 2025-12-15 model snapshots, replacing the previous 2025-10-06 versions. Developers needing the older snapshots can pin to gpt-realtime-mini-2025-10-06 and gpt-audio-mini-2025-10-06 explicitly. This is a routine API maintenance update affecting audio and realtime model endpoints.
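As a minimal sketch of the pinning described above, a dated snapshot identifier can be built from a rolling slug and a snapshot date. The `slug-YYYY-MM-DD` pattern is taken from the two pinned names in this note; `pin_model` is a hypothetical helper, and the resulting string is what would be passed as the model name in an API request.

```python
from datetime import date


def pin_model(slug: str, snapshot: str) -> str:
    """Build a dated model identifier such as 'gpt-audio-mini-2025-10-06'.

    Hypothetical helper. The snapshot must be an ISO date (YYYY-MM-DD);
    date.fromisoformat raises ValueError for anything else.
    """
    date.fromisoformat(snapshot)  # validate the date format only
    return f"{slug}-{snapshot}"


# Pin both mini models to the previous snapshots named in the release note.
PINNED = {
    slug: pin_model(slug, "2025-10-06")
    for slug in ("gpt-realtime-mini", "gpt-audio-mini")
}
```

Using the bare slug instead of a pinned name follows whatever snapshot the slug currently points to, which per this note is now 2025-12-15.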

Frontier Model Releases OpenAI gpt-audio-mini gpt-realtime-mini

7 · Latent Space · 1mo ago

GPT-Realtime-2, GPT-Translate, and new Whisper: OpenAI's new SOTA realtime voice APIs

OpenAI has released a suite of new real-time voice and audio APIs including GPT-Realtime-2, a GPT-Translate model, and an updated Whisper, all positioned as state-of-the-art for real-time voice applications. The releases appear to be part of a broader push to deploy GPT-5 capabilities across multiple product surfaces. Coverage comes from the Latent Space AI News digest, which aggregates and contextualizes the announcements.

Frontier Model Releases Agent and Tool Ecosystem GPT-Realtime-2 OpenAI Whisper +3 more

6 · OpenAI Release Notes · 2d ago

OpenAI releases Realtime 2, Realtime Translate, and Realtime Whisper for speech-to-speech and streaming audio

OpenAI released three new audio API products: Realtime 2, a speech-to-speech voice model with configurable reasoning for agentic applications; Realtime Translate, for streaming speech translation; and Realtime Whisper, for streaming speech-to-text transcription. The release is accompanied by updated documentation including a dedicated Realtime translation guide and refreshed transcription guidance. These additions expand OpenAI's real-time audio API surface for developers building voice agents and multilingual applications.

Agent and Tool Ecosystem Multimodal Progress GPT-Realtime-Translate OpenAI GPT-Realtime-Whisper +1 more

5 · OpenAI Release Notes · 2d ago

OpenAI releases gpt-5.3-chat-latest to Chat Completions and Responses APIs

OpenAI has made gpt-5.3-chat-latest available via the Chat Completions and Responses APIs. The slug points to the GPT-5.3 Instant snapshot currently powering ChatGPT, giving API developers access to the same model version deployed in the consumer product. The update is a routine API availability expansion rather than a new model capability announcement.

Frontier Model Releases Inference Economics Responses API ChatGPT OpenAI +2 more

7 · OpenAI Blog · 1mo ago

Advancing voice intelligence with new models in the API

OpenAI is releasing new realtime voice models via its API with capabilities spanning reasoning, translation, and transcription. The announcement targets developers building voice-enabled applications and represents an expansion of OpenAI's voice intelligence offerings beyond the existing Realtime API. The models are positioned to enable more natural and intelligent voice experiences in production deployments.

Frontier Model Releases Enterprise Deployment Patterns OpenAI voice models OpenAI Realtime API OpenAI +1 more

9 · OpenAI Release Notes · 2d ago

OpenAI releases GPT-5.5 and GPT-5.5 Pro to the API with 1M token context and built-in agentic tools

OpenAI released GPT-5.5 and GPT-5.5 Pro to the Chat Completions and Responses APIs, positioning GPT-5.5 as a frontier model for complex professional work and GPT-5.5 Pro for compute-intensive tasks. GPT-5.5 supports a 1M token context window, image input, structured outputs, function calling, built-in computer use, hosted shell, MCP, web search, and Skills. Notable behavioral changes include reasoning effort defaulting to medium and extended-only prompt caching support.

Long Context Evolution Frontier Model Releases Responses API GPT Pro OpenAI +4 more