Entity · technique

optimism bias

technique · active · optimism-bias-9c476321 · 1 event · first seen May 29, 2026

Aliases: optimism bias


More like this (12)

OptimismBench
OptimismBench: Forecasting Bias and the Alignment Effect in Language Model Judgment
adaptive thinking
Hope-attention
Pessimism's Paradox: Conservative Offline Training Amplifies Reward Hacking During Online Adaptation in Reasoning Models
Generalization in offline RL: The structure is more important than the amount of pessimism
Perceptual Judgment Bias
optimization theory
geopolitical bias
Cumulative Prospect Theory
Expert Blindness Effect
counterfactual reasoning

Recent events (1)

6 · arXiv · cs.LG · May 29, 2026

SoundnessBench: Benchmarking LLMs as Evaluators of ML Research Proposal Viability

SoundnessBench is a new benchmark of 1,099 machine-learning research proposals derived from ICLR submissions and labeled with reviewer soundness scores. It is designed to test whether LLMs can reliably distinguish methodologically sound research ideas from unsound ones. Across 12 frontier LLMs, the benchmark reveals a pervasive optimism bias: under standard prompting, models systematically rate low-soundness proposals as sound, and aggressive prompting shifts the errors from false positives to false negatives rather than eliminating them. Controls for data contamination, surface features, and human audit quality suggest the bias is not attributable to a single confounder. The authors conclude that current LLMs are not yet reliable as standalone first-gate evaluators of scientific rigor, a role that is a critical bottleneck for autonomous AI research agents.
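
To make the false-positive versus false-negative distinction in the summary concrete, here is a minimal Python sketch of how the two error rates could be computed for binary soundness verdicts. It is an illustration only, not the benchmark's actual scoring code: the `error_rates` helper, the `ErrorRates` type, and the six toy labels and verdicts are all hypothetical and are not drawn from SoundnessBench.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class ErrorRates:
    false_positive_rate: float  # share of unsound proposals judged sound
    false_negative_rate: float  # share of sound proposals judged unsound


def error_rates(is_sound: Sequence[bool], judged_sound: Sequence[bool]) -> ErrorRates:
    """Compare reference labels with model verdicts (hypothetical helper)."""
    if len(is_sound) != len(judged_sound):
        raise ValueError("labels and verdicts must have the same length")
    pairs = list(zip(is_sound, judged_sound))
    false_pos = sum(1 for truth, verdict in pairs if not truth and verdict)
    false_neg = sum(1 for truth, verdict in pairs if truth and not verdict)
    n_unsound = sum(1 for truth in is_sound if not truth)
    n_sound = len(is_sound) - n_unsound
    return ErrorRates(
        false_positive_rate=false_pos / n_unsound if n_unsound else 0.0,
        false_negative_rate=false_neg / n_sound if n_sound else 0.0,
    )


# Toy data, invented for illustration (not benchmark results).
labels = [True, True, False, False, False, True]
standard_verdicts = [True, True, True, True, False, True]       # lenient judge
aggressive_verdicts = [False, True, False, False, False, False]  # strict judge

standard = error_rates(labels, standard_verdicts)
aggressive = error_rates(labels, aggressive_verdicts)
print(standard)
print(aggressive)
```

In this toy setup the lenient judge errs only by accepting unsound proposals, while the strict judge errs only by rejecting sound ones. That is the kind of shift in error type, rather than a reduction in error, that the summary describes.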

Evaluation and Benchmarking · AI Safety Research · ICLR · optimism bias · SoundnessBench