technique
process supervision
technique · active
process-supervision-6713cb58 · 1 event · first seen 28d ago
Aliases: process supervision
Recent events (1)
Improving Mathematical Reasoning with Process Supervision
OpenAI trained a model that achieved state-of-the-art mathematical problem solving by rewarding each correct reasoning step (process supervision) rather than only the correct final answer (outcome supervision). Beyond outperforming outcome supervision on math benchmarks, the approach carries an alignment benefit: it directly trains the model to produce chain-of-thought reasoning that humans endorse. The work points to a potential synergy between capability improvements and alignment techniques.
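The difference between the two reward signals can be illustrated with a minimal Python sketch. It is not OpenAI's implementation: the `Solution` class, the function names, and the hand-written step labels are all hypothetical, chosen only to show that outcome supervision yields one signal per solution while process supervision yields one per step.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Solution:
    """A chain-of-thought solution with hypothetical human labels."""
    steps: List[str]          # reasoning steps, in order
    step_correct: List[bool]  # assumed per-step labels from a human rater
    final_correct: bool       # whether the final answer is right


def outcome_reward(sol: Solution) -> List[float]:
    """Outcome supervision: a single signal based only on the final answer."""
    return [1.0 if sol.final_correct else 0.0]


def process_reward(sol: Solution) -> List[float]:
    """Process supervision: one signal for every reasoning step."""
    return [1.0 if ok else 0.0 for ok in sol.step_correct]


def first_error(sol: Solution) -> int:
    """Index of the first step labeled incorrect, or -1 if there is none."""
    for i, ok in enumerate(sol.step_correct):
        if not ok:
            return i
    return -1


# Illustrative case: invalid reasoning that happens to reach a correct answer.
example = Solution(
    steps=[
        "Write the fraction 16/64.",
        "Cancel the digit 6 in the numerator and denominator to get 1/4.",
        "So 16/64 = 1/4.",
    ],
    step_correct=[True, False, True],  # hypothetical labels
    final_correct=True,
)

print(outcome_reward(example))  # [1.0]
print(process_reward(example))  # [1.0, 0.0, 1.0]
print(first_error(example))     # 1
```

In this toy case the outcome signal fully rewards a solution whose second step is invalid, while the per-step signal flags exactly where the reasoning goes wrong. That is the intuition behind the alignment benefit described above.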