technique
Bootstrap Mode Frequency
technique · active · provisional
bootstrap-mode-frequency-59504269 · 1 event · first seen 22d ago
Aliases: Bootstrap Mode Frequency
Recent events (1)
Confidence and Calibration of Activation Oracles for Reliable Interpretation of Language Model Internals
This paper investigates uncertainty quantification (UQ) for activation oracles, systems that make LLM internal activations human-legible, by evaluating six confidence estimation methods across 6,000 samples per oracle. The authors find that bootstrap mode frequency achieves the best calibration (expected calibration error, ECE, of 5.7% vs. 25.5% for the log-probability baseline on Qwen3-8B), while the log-probability baseline remains useful as a cheap triage signal. Experiments vary verbalizer and context prompts across two Qwen3 model sizes. Code and a patched trainer are publicly released.
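The summary above does not spell out how bootstrap mode frequency is computed. One plausible reading, offered here only as an illustrative sketch and not as the authors' implementation, is: sample several answers from the oracle for the same input, resample that set with replacement many times, and report the fraction of resamples whose most common answer matches the original most common answer. The function names, the tie-breaking rule, the default of 1,000 resamples, and the equal-width binning used for ECE are all assumptions introduced for this example.

```python
import random
from collections import Counter


def mode(answers):
    """Most common answer; ties go to the answer seen first."""
    return Counter(answers).most_common(1)[0][0]


def bootstrap_mode_frequency(answers, n_boot=1000, seed=0):
    """Return (modal answer, share of bootstrap resamples with the same mode).

    Assumed definition, not taken from the paper: `answers` is a list of
    repeated oracle outputs for one input.
    """
    rng = random.Random(seed)
    point = mode(answers)
    hits = 0
    for _ in range(n_boot):
        # Resample with replacement, same size as the original set.
        resample = rng.choices(answers, k=len(answers))
        if mode(resample) == point:
            hits += 1
    return point, hits / n_boot


def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width-bin ECE: sample-weighted mean of |accuracy - confidence|."""
    bins = [[] for _ in range(n_bins)]
    for c, ok in zip(confidences, correct):
        idx = min(int(c * n_bins), n_bins - 1)  # put c == 1.0 in the last bin
        bins[idx].append((c, ok))
    total = len(confidences)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        avg_conf = sum(c for c, _ in b) / len(b)
        acc = sum(1 for _, ok in b if ok) / len(b)
        ece += (len(b) / total) * abs(acc - avg_conf)
    return ece


if __name__ == "__main__":
    # Hypothetical oracle outputs for a single input.
    answer, conf = bootstrap_mode_frequency(["cat"] * 6 + ["dog"] * 4)
    print(answer, conf)
```

Under this reading, a unanimous set of answers always yields confidence 1.0, while a split set yields a lower value that reflects how easily the majority answer could flip under resampling. The ECE helper then compares such confidences against correctness labels over a whole evaluation set.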