Almanac
5·OpenAI Release Notes·2d ago

OpenAI releases gpt-realtime-1.5 and gpt-audio-1.5 to production APIs

OpenAI has released gpt-realtime-1.5 to the Realtime API and gpt-audio-1.5 to the Chat Completions API. These are incremental model updates to OpenAI's audio and real-time speech capabilities. The release expands developer access to updated audio-capable models through existing API surfaces.

Related events (8)

7·OpenAI Blog·1mo ago

Introducing gpt-realtime and Realtime API updates

OpenAI is releasing a new speech-to-speech model called gpt-realtime alongside expanded Realtime API capabilities. New features include MCP server support, image input, and SIP phone calling support. These updates extend the Realtime API's utility for voice-driven and multimodal agent applications.

6·The Batch·1mo ago

OpenAI Updates Audio Models That Reason, Transcribe, and Translate

OpenAI introduced three new audio models in its Realtime API: GPT-Realtime-2 (speech-to-speech with five configurable reasoning effort levels), GPT-Realtime-Translate (70+ input languages), and GPT-Realtime-Whisper (transcription). GPT-Realtime-2 operates as an end-to-end audio model including reasoning, with latency ranging from 1.12 seconds at minimal effort to 2.33 seconds at high effort. Benchmark results are mixed: it leads Scale AI's Audio MultiChallenge and Artificial Analysis Conversational Dynamics but trails Step-Audio R1.1 Realtime and Grok Voice Think Fast 1.0 on speech reasoning and agentic tasks. The configurable reasoning-latency tradeoff is positioned as a key differentiator for voice agent applications.

2·OpenAI Release Notes·2d ago

OpenAI updates gpt-realtime-mini and gpt-audio-mini slugs to 2025-12-15 snapshots

OpenAI has updated the rolling model slugs gpt-realtime-mini and gpt-audio-mini to point to the 2025-12-15 model snapshots, replacing the previous 2025-10-06 versions. Developers needing the older snapshots can pin to gpt-realtime-mini-2025-10-06 and gpt-audio-mini-2025-10-06 explicitly. This is a routine API maintenance update affecting audio and realtime model endpoints.

7·Latent Space·1mo ago

GPT-Realtime-2, GPT-Translate, and new Whisper: OpenAI's new SOTA realtime voice APIs

OpenAI has released a suite of new real-time voice and audio APIs including GPT-Realtime-2, a GPT-Translate model, and an updated Whisper, all positioned as state-of-the-art for real-time voice applications. The releases appear to be part of a broader push to deploy GPT-5 capabilities across multiple product surfaces. Coverage comes from the Latent Space AI News digest, which aggregates and contextualizes the announcements.

6·OpenAI Release Notes·2d ago

OpenAI releases Realtime 2, Realtime Translate, and Realtime Whisper for speech-to-speech and streaming audio

OpenAI released three new audio API products: Realtime 2, a speech-to-speech voice model with configurable reasoning for agentic applications; Realtime Translate, for streaming speech translation; and Realtime Whisper, for streaming speech-to-text transcription. The release is accompanied by updated documentation including a dedicated Realtime translation guide and refreshed transcription guidance. These additions expand OpenAI's real-time audio API surface for developers building voice agents and multilingual applications.

5·OpenAI Release Notes·2d ago

OpenAI releases gpt-5.3-chat-latest to Chat Completions and Responses API

OpenAI has made gpt-5.3-chat-latest available via the Chat Completions and Responses APIs, pointing to the GPT-5.3 Instant snapshot currently powering ChatGPT. This release gives API developers access to the same model version deployed in the consumer product. The update is a routine API availability expansion rather than a new model capability announcement.

7·OpenAI Blog·1mo ago

Advancing voice intelligence with new models in the API

OpenAI is releasing new realtime voice models via its API with capabilities spanning reasoning, translation, and transcription. The announcement targets developers building voice-enabled applications and represents an expansion of OpenAI's voice intelligence offerings beyond the existing Realtime API. The models are positioned to enable more natural and intelligent voice experiences in production deployments.

9·OpenAI Release Notes·2d ago

OpenAI releases GPT-5.5 and GPT-5.5 Pro to the API with 1M token context and built-in agentic tools

OpenAI released GPT-5.5 and GPT-5.5 Pro to the Chat Completions and Responses APIs, positioning GPT-5.5 as a frontier model for complex professional work and GPT-5.5 Pro for compute-intensive tasks. GPT-5.5 supports a 1M token context window, image input, structured outputs, function calling, built-in computer use, hosted shell, MCP, web search, and Skills. Notable behavioral changes include reasoning effort defaulting to medium and extended-only prompt caching support.