6 · OpenAI Blog · 1mo ago

Using GPT-4 for Content Moderation

OpenAI describes using GPT-4 to assist with content policy development and moderation decisions, reducing the involvement of human moderators. The approach aims to make labeling more consistent and to shorten policy iteration cycles. It is a practical deployment of a frontier model in a high-stakes operational role within OpenAI itself.

AI Safety Research · Enterprise Deployment Patterns · OpenAI · GPT-4

Related guides (3)

OpenAI

OpenAI: The Lab That Made AI a Household Word

AI Safety Research · Topic guide

AI Safety Research: From Lab Policies to Real-World Flashpoints

Enterprise Deployment Patterns · Topic guide

Enterprise Deployment Patterns: From AI Demo to Production Reality

Related events (8)

5 · OpenAI Blog · 1mo ago

OpenAI Upgrades Moderation API with GPT-4o-Based Multimodal Model

OpenAI has released an updated Moderation API powered by a new model built on GPT-4o, extending content moderation capabilities to both text and images. The update aims to improve accuracy in detecting harmful content, giving developers better tools for building moderation systems. This represents an expansion of OpenAI's safety infrastructure into multimodal domains.

AI Safety Research · Enterprise Deployment Patterns · GPT-4o · OpenAI Moderation API · OpenAI

9 · OpenAI Blog · 1mo ago

GPT-4 Release

OpenAI released GPT-4, a large multimodal model accepting image and text inputs and producing text outputs. The model demonstrates human-level performance on various professional and academic benchmarks. It represents OpenAI's latest milestone in scaling deep learning.

Frontier Model Releases · Evaluation and Benchmarking · OpenAI · GPT-4

5 · OpenAI Blog · 1mo ago

How should AI systems behave, and who should decide?

OpenAI published a policy post clarifying how ChatGPT's behavior is shaped and governed, outlining plans to allow greater user customization of model behavior. The post also describes intentions to solicit broader public input into decision-making around AI system behavior. This represents an early public articulation of OpenAI's approach to behavioral governance and value alignment in deployed systems.

Enterprise Deployment Patterns · Alignment and RLHF · ChatGPT · OpenAI

7 · OpenAI Blog · 1mo ago

GPT-4o System Card

OpenAI published the system card for GPT-4o, its flagship multimodal model. The document covers safety evaluations, capability assessments, and risk mitigations conducted prior to deployment. It provides transparency into the model's performance across modalities including text, audio, and vision, as well as alignment and red-teaming findings.

Frontier Model Releases · Evaluation and Benchmarking · GPT-4o · OpenAI

8 · OpenAI Blog · 1mo ago

Introducing GPT-4o and More Tools to ChatGPT Free Users

OpenAI is launching GPT-4o, its newest flagship model, and expanding access to additional capabilities for free-tier ChatGPT users. This represents a significant democratization move, bringing frontier model capabilities to users without a paid subscription. The announcement signals OpenAI's strategy to broaden its user base while maintaining competitive pressure on rivals.

Frontier Model Releases · Inference Economics · ChatGPT · GPT-4o · OpenAI

9 · OpenAI Blog · 1mo ago

Introducing GPT-5.4

OpenAI has released GPT-5.4, described as its most capable and efficient frontier model, aimed at professional work. The model features state-of-the-art coding, computer use, and tool search capabilities, along with a 1 million token context window. This represents a significant capability and efficiency advancement over prior GPT-5 series models.

Long Context Evolution · Frontier Model Releases · OpenAI · computer use · 1M-token context

4 · OpenAI Blog · 1mo ago

OpenAI Launches Free Moderation Endpoint for API Developers

OpenAI introduced a new Moderation endpoint as a free tool for API developers, replacing its previous content filter. The endpoint is designed to help developers detect and filter harmful or policy-violating content in their applications. This represents an incremental improvement to OpenAI's content moderation infrastructure.

AI Safety Research · Enterprise Deployment Patterns · OpenAI Moderation Endpoint · OpenAI API · OpenAI

7 · OpenAI Blog · 1mo ago

Finding GPT-4's Mistakes with GPT-4: CriticGPT

OpenAI has developed CriticGPT, a GPT-4-based model trained to write critiques of ChatGPT outputs, helping human trainers identify errors during RLHF. The system is designed to address a core scalable oversight challenge: human raters often miss subtle mistakes in long or complex model outputs. CriticGPT-assisted trainers outperformed unassisted trainers in catching model errors, suggesting a path toward more reliable RLHF pipelines.

Evaluation and Benchmarking · AI Safety Research · ChatGPT · CriticGPT · Reinforcement Learning from Human Feedback