Almanac
model

CriticGPT

model·active·criticgpt-a8576b20·1 event·first seen 28d ago

Aliases: CriticGPT

Recent events (1)

7·OpenAI Blog·28d ago

Finding GPT-4's Mistakes with GPT-4: CriticGPT

OpenAI has developed CriticGPT, a GPT-4-based model trained to write critiques of ChatGPT outputs, helping human trainers identify errors during reinforcement learning from human feedback (RLHF). The system is designed to address a core scalable oversight challenge: human raters often miss subtle mistakes in long or complex model outputs. Trainers assisted by CriticGPT outperformed unassisted trainers at catching model errors, suggesting a path toward more reliable RLHF pipelines.