CriticGPT
model · active
criticgpt-a8576b20 · 1 event · first seen 28d ago · Aliases: CriticGPT
Recent events (1)
Finding GPT-4's Mistakes with GPT-4: CriticGPT
OpenAI has developed CriticGPT, a GPT-4-based model trained to write critiques of ChatGPT outputs, helping human trainers identify errors during RLHF. The system is designed to address a core scalable oversight challenge: human raters often miss subtle mistakes in long or complex model outputs. CriticGPT-assisted trainers outperformed unassisted trainers in catching model errors, suggesting a path toward more reliable RLHF pipelines.