Entity · technique

safe-completions

technique · active · safe-completions-9779a072 · 1 event · first seen May 20, 2026

Aliases: safe-completions

Co-occurring entities

output-centric safety training · OpenAI · GPT-5.5

More like this (12)

Completions API · FIM Completion · SafeCoder · safetensors · Safe Exploration Benchmark · output-centric safety training · Safety & Preparedness Report · sandboxing · Safety Gym · SafetyKit · SafeCtrl-RL · joint safety evaluation

Recent events (1)

7 · OpenAI Blog · May 20, 2026

From hard refusals to safe-completions: toward output-centric safety training

OpenAI introduces a 'safe-completions' approach in GPT-5 that replaces hard refusals with nuanced, output-centric safety training for handling dual-use prompts. Rather than refusing requests outright, the model is trained to produce responses that are both helpful and safe by shaping the content of outputs. This represents a methodological shift in how safety and helpfulness are balanced during training, moving away from binary refusal behavior toward graduated response strategies.

Frontier Model Releases · AI Safety Research · output-centric safety training · OpenAI · safe-completions