Research Gap Inference

technique · active · provisional · research-gap-inference-79478858 · 1 event · first seen 8d ago

Aliases: Research Gap Inference

Recent events (1)

6 · arXiv · cs.CL · 8d ago

Multi-turn evaluation reveals deep research agents fail to compound gains from process-level feedback

A new arXiv paper evaluates deep research agents (DRAs) over multiple feedback turns, comparing self-reflection against process-level feedback generated by a novel method called Research Gap Inference (RGI). Key findings: self-reflection yields negligible net improvement. A single round of process-level feedback raises normalized scores by 8-15 points, with an incorporation rate of roughly 35-40%. The gains do not compound across turns, however, because agents regress on up to 24% of the criteria they had previously satisfied. The results suggest that reliable multi-turn improvement remains out of reach for current DRA architectures, which points to a fundamental limitation of iterative agentic research workflows.
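
The interaction between incorporation and regression can be illustrated with a small sketch. The summary does not give the paper's exact metric definitions, so the functions below (`normalized_score`, `incorporation_rate`, `regression_rate`) and the toy rubric are hypothetical, assuming each report is graded against a checklist of pass/fail criteria.

```python
from typing import Dict, List

# Hypothetical representation: criterion id -> whether the report satisfies it.
Rubric = Dict[str, bool]


def normalized_score(rubric: Rubric) -> float:
    """Percentage of criteria satisfied, on a 0-100 scale (assumed definition)."""
    if not rubric:
        return 0.0
    return 100.0 * sum(rubric.values()) / len(rubric)


def incorporation_rate(prev: Rubric, curr: Rubric, feedback: List[str]) -> float:
    """Share of criteria flagged by feedback (unmet in prev) that are met in curr."""
    targeted = [c for c in feedback if not prev.get(c, False)]
    if not targeted:
        return 0.0
    fixed = sum(1 for c in targeted if curr.get(c, False))
    return fixed / len(targeted)


def regression_rate(prev: Rubric, curr: Rubric) -> float:
    """Share of criteria satisfied in prev that are no longer satisfied in curr."""
    satisfied = [c for c, ok in prev.items() if ok]
    if not satisfied:
        return 0.0
    lost = sum(1 for c in satisfied if not curr.get(c, False))
    return lost / len(satisfied)


# Toy data (invented for illustration, not taken from the paper).
turn1 = {"a": True, "b": True, "c": False, "d": False, "e": False}
feedback = ["c", "d", "e"]  # criteria the feedback asks the agent to address
turn2 = {"a": True, "b": False, "c": True, "d": False, "e": False}

print(normalized_score(turn1), normalized_score(turn2))
print(incorporation_rate(turn1, turn2, feedback))
print(regression_rate(turn1, turn2))
```

In this toy case the agent fixes one of the three flagged criteria but loses one it had already satisfied, so the normalized score does not move between turns. This is the kind of offsetting effect that would keep gains from compounding.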