Entity · model

GPT-5.2

Superseded: GPT-5.5 is the current OpenAI flagship model (supersedes GPT-5.4 and GPT-5.2). See GPT-5.5.

model · active · gpt-5-2-5ad34f86 · 21 events · first seen May 20, 2026

Aliases: GPT-5.2

More like this (12)

GPT-5.5 · GPT-5.3 · GPT Pro · GPT-4.1 · GPT-5.2-high · GPT-5.5-Cyber · GPT-4 · GPT-5.4 mini · GPT-2 · GPT-4V · GPT-5.4 nano · GPT-1

Recent events (21)

4 · arXiv · cs.CL · 4d ago

SINT-Flow: LLM workflow framework for automated schema integration with new SINT-Bench benchmark

SINT-Flow is a schema integration framework using five LLM-based operators composed into workflows to perform fully automated end-to-end schema integration, including decomposition of denormalized tables describing multiple entity types. The authors also introduce SINT-Bench, a benchmark of 10 schema integration tasks across 93 relational tables. Evaluation using GPT-5.2 and Qwen-3.6-27B as backbone models achieves F1 scores of 96%+ for entity-type detection, 85% for attribute detection, and 83% for schema mapping. The work demonstrates LLM applicability to a classical database integration problem with a self-consistency strategy and review loop ablation.

Evaluation and Benchmarking · Agent and Tool Ecosystem · SINT-Flow · GPT-5.2 · Qwen 3.5 27B

6 · arXiv · cs.AI · 4d ago

ClinFusion: Vision-centric multimodal LLM for holistic medical image understanding

Researchers introduce ClinFusion, a multimodal LLM system designed for clinical medical understanding, featuring a cascaded vision encoder architecture (Cascade Spatial-Aware Locality Fusion) that handles both 2D and native 3D medical images within a unified encoder. The system is evaluated on a new benchmark suite including MedIF-Bench and a region-of-interest-grounded report generation metric, claiming state-of-the-art results on 20 of 24 benchmarks against open-source medical MLLMs and outperforming GPT-5.2 and Gemini-3-Flash on 13 of 16 multimodal benchmarks. Blinded evaluation by board-certified radiologists confirms ClinFusion produces the highest-ranked radiology reports, and the proposed RoI-grounded metric shows the strongest correlation with expert judgment among automatic metrics tested.

Evaluation and Benchmarking · Agent and Tool Ecosystem · MedIF-Bench · GPT-5.2 · Cascade Spatial-Aware Locality Fusion

6 · OpenAI Release Notes · Jul 1, 2026

OpenAI releases GPT-5.2-Codex to Responses API

OpenAI has released GPT-5.2-Codex via the Responses API, a variant of GPT-5.2 specifically optimized for agentic coding tasks in Codex and similar environments. The release extends the GPT-5.2 model family with a coding-specialized deployment. This is a tier-1 announcement from OpenAI's official release notes.

Frontier Model Releases · Agent and Tool Ecosystem · GPT-5.3-Codex · Responses API · GPT-5.2

5 · OpenAI Release Notes · Jul 1, 2026

OpenAI inference stack optimization delivers ~40% speed increase for GPT-5.2 and GPT-5.2-Codex

OpenAI has optimized its inference stack for API customers, resulting in approximately 40% faster inference for GPT-5.2 and GPT-5.2-Codex. Model weights are unchanged, meaning the improvement is purely infrastructural. This is a meaningful latency reduction for production API users without any model capability tradeoff.

Inference Economics · GPT-5.3-Codex · GPT-5.2 · OpenAI

3 · OpenAI Release Notes · Jul 1, 2026

OpenAI restores Extended thinking level for GPT-5.2 Thinking in ChatGPT after inadvertent reduction

OpenAI disclosed a series of adjustments to thinking time settings for GPT-5.2 Thinking in ChatGPT, including an unintentional reduction of the Extended thinking level in January 2026 that has now been corrected. Standard and Light thinking times were also deliberately reduced based on user preference data favoring faster responses. The update clarifies that thinking time is tuned independently per model and is not cross-model comparable, and that a thinking level toggle introduced in September 2025 gives users explicit control over the tradeoff between speed and reasoning depth.

Frontier Model Releases · Inference Economics · GPT-5.2 · ChatGPT · OpenAI

3 · OpenAI Release Notes · Jul 1, 2026

OpenAI updates gpt-5.2-chat-latest slug to current ChatGPT model

OpenAI updated the gpt-5.2-chat-latest API alias to point to the latest model version deployed in ChatGPT. This is a routing change to an API slug rather than a new model release, but it signals a model version update in production. The change matters for API consumers who rely on the latest slug to track model updates automatically.

Frontier Model Releases · GPT-5.2 · ChatGPT · OpenAI

4 · OpenAI Release Notes · Jul 1, 2026

OpenAI deprecates older GPT-5 and Codex model variants in ChatGPT

OpenAI is removing six model variants from the ChatGPT model picker starting April 7, 2026, including gpt-5.2-codex, gpt-5.1-codex-mini, gpt-5.1-codex-max, gpt-5.1-codex, gpt-5.1, and gpt-5, with full removal from Codex on April 14. Users are directed toward gpt-5.4, gpt-5.4-mini, gpt-5.3-codex, and gpt-5.2 as the supported options, with gpt-5.3-codex-spark available to Pro subscribers. The update signals OpenAI consolidating its model lineup around newer GPT-5.x variants and retiring earlier iterations.

Frontier Model Releases · Inference Economics · GPT-5.3-Codex · GPT-5.2 · ChatGPT

3 · OpenAI Release Notes · Jul 1, 2026

OpenAI retires GPT-5.2 models from ChatGPT, migrating users to GPT-5.5

As of June 12, 2026, OpenAI has removed GPT-5.2 Instant, GPT-5.2 Thinking, and GPT-5.2 Pro from ChatGPT, with existing conversations automatically continuing on corresponding GPT-5.5 models. The retirement follows OpenAI's stated policy of keeping models available for approximately 90 days after a successor is released. This reflects OpenAI's ongoing model lifecycle management as the GPT-5.x generation matures.

Frontier Model Releases · GPT-5.2 · ChatGPT · OpenAI

4 · arXiv · cs.CL · Jun 23, 2026

P4IR framework uses SFT + GRPO to improve LLM-based automated building code compliance

Researchers introduce P4IR, a two-stage framework combining supervised fine-tuning (SFT) and Group Relative Policy Optimization (GRPO) to improve LLM accuracy in automated code compliance (ACC) for building regulations. The approach reduces tree edit distance and token-level Levenshtein distance by up to 23.8% and 38.6% respectively versus SFT baselines, and outperforms Claude Opus/Sonnet 4.5, GPT-5.2, Qwen-3-Max, and GLM-4.7 in zero-shot settings. The work targets a narrow but practically important domain where LLM hallucinations carry real regulatory consequences.

Enterprise Deployment Patterns · Alignment and RLHF · GPT-5.2 · Claude Opus 4.6 · Claude Sonnet 4.5

5 · arXiv · cs.CL · Jun 17, 2026

TAC benchmark finds frontier AI agents systematically book animal-exploitative travel options below chance rate

Researchers introduce TAC (Travel Agent Compassion), the first agentic benchmark testing whether AI agents avoid animal-exploitative options when booking travel on behalf of users. Across 48 scenarios spanning six exploitation categories, all seven evaluated frontier models score below the 64% chance baseline, with the best performer (Claude Opus 4.7) at 53%. A single welfare-aware sentence in the system prompt yields dramatic gains in Claude and GPT-5.5 (47-63 percentage points) but minimal effect on DeepSeek and Gemini models. The study highlights a gap between models' text-response welfare reasoning and their agentic decision-making behavior.

Evaluation and Benchmarking · AI Safety Research · GPT-5.2 · Claude Opus 4.6 · DeepSeek V4

7 · The Batch · Jun 2, 2026

Alibaba releases Qwen3.5 open-weights vision-language model family with MoE architecture across eight sizes

Alibaba released the Qwen3.5 family of eight open-weights vision-language models ranging from 0.8B to 397B parameters, built on a mixture-of-experts architecture with mixed attention and Gated DeltaNet layers. The flagship Qwen3.5-397B-A17B outperforms GPT-5.2, Claude 4.5 Opus, and Gemini-3 Pro on 28 of 44 vision benchmarks, while the 9B model surpasses OpenAI's gpt-oss-120B on most language tasks. Open weights are available under Apache 2.0, with hosted agentic variants (Qwen3.5-Plus, Qwen3.5-Flash) available via Alibaba Cloud. The release is notable for strong small-model efficiency and comes amid reported team departures following the Qwen3 rollout.

Frontier Model Releases · Open Weights Progress · GPT-5.2 · Alibaba Cloud Model Studio · Claude Opus 4.6

9 · Anthropic News · Jun 1, 2026

Claude Opus 4.6 Released with 1M Token Context, Agentic Coding Advances, and State-of-the-Art Benchmarks

Anthropic has released Claude Opus 4.6, its most capable model to date, featuring a 1M token context window in beta, improved agentic coding and planning capabilities, and adaptive thinking with developer-controlled effort levels. The model claims top scores on Terminal-Bench 2.0, Humanity's Last Exam, GDPval-AA, and BrowseComp, outperforming OpenAI's GPT-5.2 by 144 Elo points on GDPval-AA. New product features include agent teams in Claude Code, context compaction for long-running tasks, and Claude in PowerPoint (research preview). Pricing remains unchanged at $5/$25 per million input/output tokens.

Long Context Evolution · Frontier Model Releases · GPT-5.2 · Claude Opus 4.6 · adaptive thinking

4 · arXiv · cs.CL · Jun 1, 2026

Benchmarking Local LLMs for Confidential Translation Workflows

This paper evaluates locally runnable LLMs (via Ollama) for offline, privacy-constrained translation workflows targeting freelance translators and smaller language service providers. The authors expand their Reeve Foundation corpus to include German and Simplified Chinese, then benchmark local models across four language directions against commercial NMTs (DeepL, Baidu), a frontier LLM (GPT-5.2), and professional local NMT systems. Results show substantial performance variation by language direction and model size, with the best local LLMs matching or exceeding local NMT systems and the frontier LLM, though falling short of top commercial NMTs. The study supports the viability of local LLMs for confidentiality-sensitive translation use cases.

Evaluation and Benchmarking · Open Weights Progress · Ollama · GPT-5.2 · DeepL

7 · OpenAI Blog · May 20, 2026

OpenAI Releases GPT-5.2 System Card Update

OpenAI has published a system card update for GPT-5.2, the latest model family in the GPT-5 series. The safety mitigation approach is described as largely consistent with the prior GPT-5 and GPT-5.1 system cards. Training data sources follow the same pattern as other OpenAI models: publicly available internet data, third-party partnerships, and user/researcher-generated content.

Frontier Model Releases · AI Safety Research · GPT-5.2 · OpenAI · GPT-5.5 System Card

9 · OpenAI Blog · May 20, 2026

Introducing GPT-5.2

OpenAI has released GPT-5.2, described as their most advanced frontier model for professional use, featuring state-of-the-art reasoning, long-context understanding, coding, and vision capabilities. The model is available through ChatGPT and the OpenAI API. It is positioned to support faster and more reliable agentic workflows.

Long Context Evolution · Frontier Model Releases · GPT-5.2 · ChatGPT · OpenAI API

8 · OpenAI Blog · May 20, 2026

Advancing science and math with GPT-5.2

OpenAI has released GPT-5.2, described as its strongest model for mathematics and science, achieving state-of-the-art results on GPQA Diamond and FrontierMath benchmarks. The announcement highlights practical research applications including solving an open theoretical problem and generating verified mathematical proofs. The post positions GPT-5.2 as a meaningful step toward AI-assisted scientific discovery.

Frontier Model Releases · Evaluation and Benchmarking · GPT-5.2 · FrontierMath · GPQA Diamond

8 · OpenAI Blog · May 20, 2026

Addendum to GPT-5.2 System Card: GPT-5.2-Codex

OpenAI published a system card addendum for GPT-5.2-Codex, a specialized variant of GPT-5.2 focused on coding capabilities. The document provides safety evaluations, capability assessments, and deployment considerations specific to this coding-oriented model. As a Tier 1 source system card, it represents official documentation of a frontier coding model's properties and risk profile.

Frontier Model Releases · AI Safety Research · GPT-5.3-Codex · GPT-5.2 · OpenAI

5 · OpenAI Blog · May 20, 2026

Netomi's lessons for scaling agentic systems into the enterprise

Netomi, an enterprise AI customer service platform, shares operational lessons from deploying agentic systems at scale using OpenAI's GPT-4.1 and GPT-5.2 models. The case study covers concurrency management, governance frameworks, and multi-step reasoning in production workflows. This represents a real-world deployment pattern for frontier models in enterprise agentic contexts.

Frontier Model Releases · Enterprise Deployment Patterns · GPT-5.2 · Netomi · OpenAI

7 · OpenAI Blog · May 20, 2026

Introducing Prism: OpenAI's LaTeX-Native Research Workspace with GPT-5.2

OpenAI has launched Prism, a free LaTeX-native workspace designed for researchers that integrates GPT-5.2 directly into the writing and collaboration environment. The product targets academic and scientific workflows, combining document authoring with AI-assisted reasoning in a single interface. This also marks a public reference to GPT-5.2, indicating a model iteration beyond GPT-5.

Frontier Model Releases · Enterprise Deployment Patterns · GPT-5.2 · LaTeX · OpenAI

8 · OpenAI Blog · May 20, 2026

GPT-5.3-Codex System Card

OpenAI has released the system card for GPT-5.3-Codex, described as the most capable agentic coding model to date. It combines the frontier coding performance of GPT-5.2-Codex with the reasoning and professional knowledge capabilities of GPT-5.2. The release represents a continuation of OpenAI's Codex line of specialized coding models within the GPT-5 family.

Frontier Model Releases · Evaluation and Benchmarking · GPT-5.3-Codex · GPT-5.2 · OpenAI

9 · OpenAI Blog · May 20, 2026

GPT-5.2 derives a new result in theoretical physics

A new preprint demonstrates GPT-5.2 proposing a novel formula for a gluon amplitude in theoretical physics, which was subsequently formally proved and verified by OpenAI researchers and academic collaborators. This represents a claimed instance of an AI system producing a genuinely new scientific result rather than reproducing known work. The result was published as a preprint and announced via the OpenAI blog.

Frontier Model Releases · Evaluation and Benchmarking · GPT-5.2 · gluon amplitude · OpenAI