Entity · model

GPT-4

model · active · provisional · gpt-4-5a7c19b8 · 24 events · first seen May 20, 2026

Aliases: GPT-4


More like this (12)

GPT-4.1, GPT-4V, GPT-4o, GPT-4 Turbo, GPT-3, GPT-2, GPT-4b micro, GPT-1, GPT-5.2, GPT, GPT-5.5, GPT-4.1 mini

Recent events (24)

6 · Berkeley AI Research (BAIR) Blog · Jul 7, 2026

BAIR perspective: data systems must be redesigned for, of, and by AI agents as inference costs approach zero

UC Berkeley EECS professor Aditya Parameswaran and collaborators publish a landscape survey and perspective on the implications of near-zero AI inference costs for data systems, arguing that agents will soon become the dominant workload. The piece identifies three research challenges: redesigning databases for agentic query patterns (including 'agentic speculation' generating thousands of SQL queries per user request), building infrastructure to manage and coordinate agent swarms over long-running tasks, and verifying data systems synthesized by agents. Concrete findings include that 80-90% of sub-queries from multi-agent text-to-SQL workloads are redundant, motivating new multi-query optimization and approximate query processing approaches. The post draws on the authors' own ongoing research directions including structured memory and agent-synthesized data systems.

Training Infrastructure · Inference Economics · Berkeley AI Research (BAIR) · UC Berkeley · Aditya G. Parameswaran · +3 more
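The redundancy finding in the BAIR item above hints at why sharing work across agent-issued queries pays off. The following is a minimal sketch, not the authors' method: `normalize` is a deliberately crude, hypothetical normalization rule, and `run_query` is a hypothetical callback standing in for a real database call.

```python
from typing import Callable, Dict, List


def normalize(sql: str) -> str:
    """Hypothetical normalization: lowercase, collapse whitespace, drop a trailing ';'.

    Lowercasing also changes string literals, so this is only safe for a toy example.
    """
    return " ".join(sql.lower().split()).rstrip(";").strip()


def redundancy(queries: List[str]) -> float:
    """Fraction of queries that duplicate another query after normalization."""
    if not queries:
        return 0.0
    distinct = {normalize(q) for q in queries}
    return 1.0 - len(distinct) / len(queries)


def run_deduplicated(
    queries: List[str], run_query: Callable[[str], object]
) -> Dict[str, object]:
    """Execute each distinct normalized query once and share the result."""
    cache: Dict[str, object] = {}
    for sql in queries:
        key = normalize(sql)
        if key not in cache:
            cache[key] = run_query(key)  # only the first occurrence hits the backend
    return cache
```

Exact-match deduplication like this only catches textually identical sub-queries; overlapping but non-identical queries would need the multi-query optimization or approximate processing the post mentions.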

5 · arXiv · cs.AI · Jul 3, 2026

Eticas AI Risk Taxonomy v2.0.0: Open infrastructure for operationalizing AI audits

Eticas presents a structured AI auditing framework that bridges risk cataloging to executable audit methodology, demonstrated end-to-end on PII leakage testing against GPT-4-0314. The taxonomy organizes 76 active subcategories across 10 categories with mappings to 18 external frameworks, and is published under CC BY 4.0 with SKOS/JSON-LD distributions. The key contribution is an operationalization layer that converts named risks into measurable, severity-graded findings — addressing a gap the authors identify across at least 74 existing AI risk taxonomies. The PII leakage demonstration shows disclosure rates ranging from 0% to 84% under adversarial conditioning, graded as SYSTEMIC severity.

Evaluation and Benchmarking · AI Safety Research · SKOS · GPT-4 · Eticas · +1 more
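To make the "disclosure rate" figure in the Eticas item concrete, here is a minimal sketch of how such a rate could be computed per test condition. Everything in it is an assumption for illustration: the email regex is a stand-in PII detector, the condition names are hypothetical, and the taxonomy's actual severity grading is not reproduced.

```python
import re
from typing import Dict, List

# Hypothetical detector: treats any email-like string as leaked PII.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def disclosure_rate(responses: List[str]) -> float:
    """Share of responses containing at least one email-like string."""
    if not responses:
        return 0.0
    leaked = sum(1 for r in responses if EMAIL.search(r))
    return leaked / len(responses)


def rates_by_condition(runs: Dict[str, List[str]]) -> Dict[str, float]:
    """Disclosure rate for each named test condition."""
    return {name: disclosure_rate(rs) for name, rs in runs.items()}
```

Comparing the rate across conditions (for example, a plain prompt versus an adversarially conditioned one) is what produces a range like the one the summary reports.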

5 · arXiv · cs.CL · Jun 30, 2026

Multi-agent system using open-source LLMs outperforms GPT-4 on disinformation detection

A new arXiv preprint proposes a multi-agent system for automated disinformation detection that emulates human annotator decision-making through consensus mechanisms, cognitive diversity, and hierarchical structure. The system uses open-source models (LLaMA, Kimi, Qwen, DeepSeek, LLaMA-Nemotron) and is evaluated on English, Polish, Slovak, and Bulgarian datasets across three fact-checking tasks. The authors report better performance than individual LLMs, including GPT-4 and GPT-3.5, and cite the transparency benefits of using open-weight models.

Open Weights Progress · Agent and Tool Ecosystem · Llama Nemotron · Kimi · DeepSeek V4 · +5 more
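The consensus-plus-hierarchy idea in the disinformation item can be sketched as a vote with escalation. This is an illustrative assumption about how such a scheme might look, not the preprint's algorithm: the 0.5 threshold, the label strings, and the `supervisor` callback are all hypothetical.

```python
from collections import Counter
from typing import Callable, List, Optional


def consensus(votes: List[str], threshold: float = 0.5) -> Optional[str]:
    """Return the label whose vote share strictly exceeds threshold, else None."""
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    return label if count / len(votes) > threshold else None


def decide(votes: List[str], supervisor: Callable[[List[str]], str]) -> str:
    """Use the annotator agents' consensus when it exists; otherwise escalate."""
    agreed = consensus(votes)
    return agreed if agreed is not None else supervisor(votes)
```

For example, `decide(["false", "false", "true"], supervisor)` returns the majority label without calling the supervisor, while an even split is handed to the higher-level agent.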

4 · arXiv · cs.CL · Jun 10, 2026

Pipeline detects curriculum knowledge gaps from student-AI conversational logs using prerequisite graphs

Researchers present a pipeline that classifies student questions directed at a conversational AI teaching assistant into curriculum topics using a few-shot classifier grounded in a GPT-4-extracted prerequisite knowledge graph. Evaluated on 1,340 questions from 164 graduate students, the classifier achieves 80% accuracy across 43 labels. Topic-level question volume significantly correlates with student-reported difficulty (rho=0.491), validating that AI interaction logs carry actionable diagnostic signals about knowledge gaps.

Detecting Knowledge Gaps from Conversational AI Interactions Using Curriculum Prerequisite Graphs · OpenAI · GPT-4
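The correlation between topic-level question volume and reported difficulty in the item above can be illustrated with a rank correlation. This sketch assumes the reported rho is a Spearman coefficient, which the summary does not state; the function names are illustrative and no data from the paper is used.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Here `x` would hold per-topic question counts and `y` the per-topic difficulty ratings, one pair per curriculum topic.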

8 · Mistral AI News · Jun 1, 2026

Mistral AI Releases Mistral Large, Claims Second-Best API Model After GPT-4

Mistral AI has released Mistral Large, its most capable model to date, claiming second place among API-accessible models behind GPT-4 on standard benchmarks including MMLU, HellaSwag, and coding/math evals. The model features a 32K context window, native fluency in five European languages, function calling, and constrained output mode. Simultaneously, Mistral is launching a new Mistral Small optimized for latency, restructuring its endpoint lineup, and announcing Microsoft Azure as its first major distribution partner. This marks Mistral's first significant commercial partnership and expansion beyond its own infrastructure.

Long Context Evolution · Frontier Model Releases · Azure AI Studio · Mistral AI · Llama 2 70B · +13 more

9 · OpenAI Blog · May 20, 2026

GPT-4 Release

OpenAI released GPT-4, a large multimodal model accepting image and text inputs and producing text outputs. The model demonstrates human-level performance on various professional and academic benchmarks. It represents OpenAI's latest milestone in scaling deep learning.

Frontier Model Releases · Evaluation and Benchmarking · OpenAI · GPT-4 · +1 more

5 · OpenAI Blog · May 20, 2026

Duolingo Integrates GPT-4 for Deeper Language Learning Conversations

OpenAI announced a partnership with Duolingo to integrate GPT-4 into the language learning platform. The integration aims to fill gaps in conversational practice and explanation that traditional language learning apps struggle to provide. This represents one of the early enterprise deployments of GPT-4 at scale in an educational context.

Enterprise Deployment Patterns · Agent and Tool Ecosystem · Duolingo · Duolingo Max · OpenAI · +1 more

5 · OpenAI Blog · May 20, 2026

Be My Eyes Integrates GPT-4 for Visual Accessibility

Be My Eyes, a visual assistance app for blind and low-vision users, has integrated GPT-4 to enhance its accessibility capabilities. The partnership leverages GPT-4's multimodal vision features to provide richer, AI-powered visual interpretation for users. This represents an early real-world deployment of GPT-4's vision capabilities in an assistive technology context.

Enterprise Deployment Patterns · Multimodal Progress · Be My Eyes · OpenAI · GPT-4

5 · OpenAI Blog · May 20, 2026

Iceland Government Uses GPT-4 for Icelandic Language Preservation

The Government of Iceland is partnering with OpenAI to use GPT-4 for preserving the Icelandic language. This initiative represents an early government-level deployment of a frontier language model for cultural and linguistic preservation purposes. The effort highlights GPT-4's multilingual capabilities and its application to low-resource or endangered language contexts.

Frontier Model Releases · Enterprise Deployment Patterns · Government of Iceland · OpenAI · GPT-4 · +1 more

5 · OpenAI Blog · May 20, 2026

Stripe Leverages GPT-4 to Streamline User Experience and Combat Fraud

Stripe has integrated GPT-4 into its platform to improve user experience and enhance fraud detection capabilities. This represents an early enterprise deployment of GPT-4 coinciding with its launch on March 14, 2023. The partnership demonstrates a major fintech company adopting frontier AI models for both customer-facing and security applications.

Frontier Model Releases · Enterprise Deployment Patterns · Stripe · OpenAI · GPT-4

5 · OpenAI Blog · May 20, 2026

Khan Academy Explores GPT-4 in Limited Pilot Program

OpenAI announced a partnership with Khan Academy to pilot GPT-4 in educational settings. The initiative explores using GPT-4 to power virtual tutoring and classroom assistance tools. This represents an early enterprise deployment of GPT-4 in the education sector.

Frontier Model Releases · Enterprise Deployment Patterns · Khan Academy · OpenAI · GPT-4

7 · OpenAI Blog · May 20, 2026

GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models

OpenAI published research examining the potential labor market impacts of large language models, analyzing which occupations and tasks are most exposed to automation or augmentation by GPT-class models. The study introduces a framework for assessing LLM 'exposure' across job categories, finding that around 80% of U.S. workers could see at least 10% of their tasks affected and roughly 19% could see at least 50% of their tasks affected. The paper represents an early systematic attempt to quantify economic disruption potential from frontier AI systems.

Evaluation and Benchmarking · Enterprise Deployment Patterns · Sam Manning · Tyna Eloundou · OpenAI · +5 more

6 · OpenAI Blog · May 20, 2026

Language models can explain neurons in language models

OpenAI uses GPT-4 to automatically generate and score natural-language explanations for the behavior of individual neurons in large language models. The methodology is applied to all neurons in GPT-2, producing a public dataset of explanations and quality scores. The authors acknowledge the explanations are imperfect, framing this as an early step toward automated mechanistic interpretability. This work establishes a scalable pipeline for neuron-level analysis that could inform future interpretability and safety research.

Evaluation and Benchmarking · AI Safety Research · GPT-2 · automated mechanistic interpretability · neuron explanation dataset · +2 more
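The "generate and score" step in the neuron-explanation item can be illustrated by scoring an explanation on how well activations simulated from it track the real ones. This is a minimal sketch under stated assumptions: `simulate` is a keyword-matching stand-in for the model-based simulator, and `correlation_score` is one plausible scoring function rather than OpenAI's exact metric.

```python
from typing import List, Sequence, Set


def simulate(tokens: List[str], trigger_words: Set[str]) -> List[float]:
    """Stand-in simulator: predicts 1.0 on tokens the explanation names, else 0.0."""
    return [1.0 if t.lower() in trigger_words else 0.0 for t in tokens]


def correlation_score(real: Sequence[float], simulated: Sequence[float]) -> float:
    """Pearson correlation of real vs. simulated activations; 0.0 if either is constant."""
    n = len(real)
    mr, ms = sum(real) / n, sum(simulated) / n
    cov = sum((r - mr) * (s - ms) for r, s in zip(real, simulated))
    vr = sum((r - mr) ** 2 for r in real)
    vs = sum((s - ms) ** 2 for s in simulated)
    if vr == 0 or vs == 0:
        return 0.0
    return cov / (vr * vs) ** 0.5
```

An explanation such as "fires on nouns naming household animals or objects" would be turned into per-token predictions, and a score near 1 would indicate that the explanation accounts for most of the neuron's behavior on that text.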

8 · OpenAI Blog · May 20, 2026

OpenAI Announces Function Calling, Longer Context, and API Price Reductions

OpenAI introduced function calling capabilities to its API, enabling models to more reliably output structured JSON for calling developer-defined functions. The update also includes a longer-context gpt-3.5-turbo-16k model, updated and more steerable versions of gpt-4 and gpt-3.5-turbo, and reduced pricing on several API tiers. These changes significantly expand the practical utility of OpenAI models for agentic and tool-use applications.

Long Context Evolution · Frontier Model Releases · GPT-3.5 Turbo · OpenAI API · OpenAI · +4 more
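The function-calling pattern in the item above comes down to two pieces: a JSON-Schema-style description of each developer-defined function, and a dispatcher that runs whatever call the model emits. The sketch below makes no API request. `get_weather`, the spec contents, and the `{"name", "arguments"}` call shape are illustrative assumptions modeled on that idea, not a reproduction of the OpenAI client library.

```python
import json
from typing import Any, Callable, Dict


def get_weather(city: str, unit: str = "celsius") -> Dict[str, Any]:
    """Hypothetical developer-defined function; returns canned data."""
    return {"city": city, "unit": unit, "temperature": 21}


# JSON-Schema-style description a developer would send alongside the prompt.
SPECS: Dict[str, Dict[str, Any]] = {
    "get_weather": {
        "name": "get_weather",
        "description": "Look up the current temperature for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string"},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["city"],
        },
    }
}

REGISTRY: Dict[str, Callable[..., Any]] = {"get_weather": get_weather}


def dispatch(call: Dict[str, str]) -> Any:
    """Run the function named in a model-produced call.

    Assumes the arguments arrive as a JSON-encoded string, so they are parsed
    and checked against the spec's required fields before the call.
    """
    name = call["name"]
    args = json.loads(call["arguments"])
    missing = [k for k in SPECS[name]["parameters"]["required"] if k not in args]
    if missing:
        raise ValueError(f"missing required arguments: {missing}")
    return REGISTRY[name](**args)
```

Validating arguments before dispatch matters because the model's JSON is generated text: it can omit fields or name a function the application never registered.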

6 · OpenAI Blog · May 20, 2026

Using GPT-4 for Content Moderation

OpenAI describes using GPT-4 to assist with content policy development and moderation decisions, replacing or reducing human moderator involvement. The approach aims to improve labeling consistency and accelerate policy iteration cycles. This represents a practical deployment of a frontier model in a high-stakes operational role within OpenAI itself.

AI Safety Research · Enterprise Deployment Patterns · OpenAI · GPT-4

7 · OpenAI Blog · May 20, 2026

GPT-4V(ision) System Card

OpenAI published the system card for GPT-4V(ision), the multimodal extension of GPT-4 that accepts image inputs alongside text. The document covers capability evaluations, safety assessments, and known limitations of the vision-enabled model. It represents OpenAI's formal safety and transparency disclosure accompanying the GPT-4V release.

Frontier Model Releases · Evaluation and Benchmarking · GPT-4V · OpenAI · GPT-4 · +2 more

7 · OpenAI Blog · May 20, 2026

Building an Early Warning System for LLM-Aided Biological Threat Creation

OpenAI published a blueprint for evaluating whether LLMs can meaningfully assist in biological threat creation. In a controlled study with biology experts and students, GPT-4 was found to provide at most mild uplift in biological threat creation accuracy. The results are inconclusive but are framed as a starting point for ongoing safety research and community deliberation on biosecurity risks from AI.

Evaluation and Benchmarking · AI Safety Research · biological threat creation evaluation · OpenAI · GPT-4

6 · OpenAI Blog · May 20, 2026

OpenAI Opens First Asia Office in Japan, Releases Japanese-Optimized GPT-4 Custom Model

OpenAI has announced the opening of its first Asian office in Japan, marking a significant geographic expansion. Alongside the office launch, OpenAI is releasing a custom GPT-4 model specifically optimized for the Japanese language. This represents both a strategic business move into the Asia-Pacific market and a technical effort to improve model performance for non-English languages.

Frontier Model Releases · Enterprise Deployment Patterns · OpenAI Japan · OpenAI · GPT-4

7 · OpenAI Blog · May 20, 2026

GPT-4 API General Availability and Completions API Deprecation Plan

OpenAI has announced general availability of the GPT-4 API, alongside GPT-3.5 Turbo, DALL·E, and Whisper APIs. Concurrently, OpenAI is releasing a deprecation plan for older models in the Completions API, which are set to retire at the beginning of 2024. This marks a significant milestone in OpenAI's API product lifecycle, transitioning GPT-4 from limited access to broad developer availability.

Frontier Model Releases · Inference Economics · GPT-3.5 Turbo · DALL·E 3 · OpenAI · +4 more

9 · OpenAI Blog · May 20, 2026

Hello GPT-4o

OpenAI announces GPT-4o (the "o" stands for "omni"), a new flagship multimodal model capable of reasoning across audio, vision, and text in real time. The model represents a significant step toward natively multimodal AI, processing and generating across modalities without separate pipeline stages. It is positioned as OpenAI's primary production model going forward.

Frontier Model Releases · Inference Economics · GPT-4o · OpenAI · GPT-4 · +1 more

7 · OpenAI Blog · May 20, 2026

Extracting Concepts from GPT-4: 16 Million Patterns via Sparse Autoencoders

OpenAI applied scaled sparse autoencoders (SAEs) to GPT-4 to automatically identify approximately 16 million interpretable features or patterns in the model's internal computations. This represents a significant scaling of mechanistic interpretability techniques previously demonstrated on smaller models. The work advances the ability to understand what concepts and representations large frontier models encode internally.

Evaluation and Benchmarking · AI Safety Research · mechanistic interpretability · Sparse Autoencoder · OpenAI · +1 more
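For readers unfamiliar with sparse autoencoders, the item above can be grounded with a toy forward pass. This sketch assumes a top-k sparsity rule, which the summary does not specify, and uses tiny hand-written weights in place of trained ones. All names are illustrative.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def matvec(m: Matrix, v: Vector) -> Vector:
    """Matrix-vector product for row-major nested lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def top_k(v: Vector, k: int) -> Vector:
    """Keep the k largest entries and zero the rest (the sparsity constraint)."""
    keep = set(sorted(range(len(v)), key=lambda i: v[i], reverse=True)[:k])
    return [x if i in keep else 0.0 for i, x in enumerate(v)]


def encode(x: Vector, w_enc: Matrix, b_pre: Vector, k: int) -> Vector:
    """Map an activation vector to a sparse vector of feature activations."""
    centered = [a - b for a, b in zip(x, b_pre)]
    return top_k(matvec(w_enc, centered), k)


def decode(z: Vector, w_dec: Matrix, b_pre: Vector) -> Vector:
    """Reconstruct the activation vector from the sparse features."""
    return [a + b for a, b in zip(matvec(w_dec, z), b_pre)]
```

Each nonzero entry of the encoded vector corresponds to one learned feature. Interpretability work then asks what inputs make a given feature fire, with training minimizing the gap between `x` and `decode(encode(x, ...), ...)`.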

7 · OpenAI Blog · May 20, 2026

Finding GPT-4's Mistakes with GPT-4: CriticGPT

OpenAI has developed CriticGPT, a GPT-4-based model trained to write critiques of ChatGPT outputs, helping human trainers identify errors during RLHF. The system is designed to address a core scalable oversight challenge: human raters often miss subtle mistakes in long or complex model outputs. CriticGPT-assisted trainers outperformed unassisted trainers in catching model errors, suggesting a path toward more reliable RLHF pipelines.

Evaluation and Benchmarking · AI Safety Research · ChatGPT · CriticGPT · Reinforcement Learning from Human Feedback · +4 more

4 · OpenAI Blog · May 20, 2026

Ada Uses GPT-4 to Deliver a New Customer Service Standard

Ada, a customer service platform, has integrated GPT-4 to power its AI-driven support capabilities. The announcement, published on OpenAI's blog, highlights the deployment of GPT-4 in an enterprise customer service context. This represents a concrete enterprise deployment case study for GPT-4 in production customer-facing workflows.

Enterprise Deployment Patterns · Ada · OpenAI · GPT-4

3 · OpenAI Blog · May 20, 2026

Using GPT-4 to Improve Teaching and Learning in Brazil

OpenAI has partnered with Arco Education, a Brazilian edtech company, to deploy GPT-4 in educational settings across Brazil. The initiative aims to enhance teaching and learning outcomes by integrating large language model capabilities into Arco's existing platforms. This represents an enterprise deployment of GPT-4 in the Latin American education sector.

Enterprise Deployment Patterns · OpenAI · GPT-4 · Arco Education