Quantifying Faithful Confidence Expression in Large Reasoning Models
Framework for quantifying faithful confidence expression in large reasoning models
A new arXiv preprint introduces a framework for measuring faithful calibration (FC) in large reasoning models (LRMs), defined as the alignment between a model's intrinsic confidence and the confidence it expresses in language. The authors compare linguistic decisiveness against three internal uncertainty sources (token probabilities, hidden states, and sampled response consistency) and introduce prefix-conditioned sampling to handle structural variation in chain-of-thought traces. Applying the framework across leading models, they find that FC is a significant and distinct failure mode for LRMs: extended reasoning traces do not automatically improve calibration, prompt interventions that help non-reasoning models fail in the reasoning setting, and different confidence estimators produce divergent assessments of the same traces.
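To make the idea of comparing expressed and intrinsic confidence concrete, here is a minimal Python sketch. It is not the preprint's metric, which this summary does not specify. It assumes that linguistic decisiveness has already been scored on a 0 to 1 scale, estimates intrinsic confidence from sampled response consistency only, and uses a mean absolute gap as a stand-in measure. The function names `consistency_confidence` and `faithfulness_gap` are hypothetical.

```python
from collections import Counter
from typing import Sequence


def consistency_confidence(sampled_answers: Sequence[str], answer: str) -> float:
    """Fraction of sampled answers that agree with `answer`.

    Assumption: agreement is exact match after trimming and lowercasing.
    """
    if not sampled_answers:
        raise ValueError("need at least one sampled answer")
    counts = Counter(a.strip().lower() for a in sampled_answers)
    return counts[answer.strip().lower()] / len(sampled_answers)


def faithfulness_gap(decisiveness: Sequence[float], confidence: Sequence[float]) -> float:
    """Mean absolute gap between expressed decisiveness and intrinsic confidence.

    Both inputs are assumed to lie in [0, 1]; 0.0 means the expressed
    confidence matches the intrinsic estimate on every example.
    Illustrative measure only, not the preprint's definition.
    """
    if len(decisiveness) != len(confidence) or not decisiveness:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(d - c) for d, c in zip(decisiveness, confidence))
    return total / len(decisiveness)
```

For example, if three of four sampled answers agree with the final answer, `consistency_confidence` returns 0.75. A response phrased with a decisiveness of 0.9 would then overstate that estimate by about 0.15. Token-probability or hidden-state estimators could be substituted for the consistency estimate, and by the preprint's findings they may give a different picture of the same trace.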