DeepRubric
deeprubric-998f055e · 1 event · first seen 35h ago
Aliases: DeepRubric, DeepRubric-8B
Recent events (1)
DeepRubric: Evidence-tree rubric supervision cuts RL training cost for deep research agents by 13x
DeepRubric is a data construction framework that improves reinforcement learning efficiency for deep research agents by reversing the typical rubric-generation process: rather than inferring evaluation criteria from a query, it first builds an evidence tree of verifiable sub-questions, then synthesizes aligned query-rubric pairs from that tree. The authors construct 9K training examples and train DeepRubric-8B using rubric-based GRPO, reaching performance comparable to prior open-source state-of-the-art deep research models on three benchmarks while using roughly 13x fewer RL GPU-hours. The work addresses a key bottleneck in RL-based training of long-form research agents: unreliable reward signals caused by incomplete rubrics.
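To make the tree-first idea concrete, here is a minimal Python sketch of how an evidence tree could yield a rubric, and how rubric scores might feed a GRPO-style group-relative advantage. It is an illustration under stated assumptions, not the authors' implementation: the `EvidenceNode` schema, the function names, and the substring check that stands in for a rubric judge are all hypothetical, and the standardization step follows the commonly described GRPO formulation rather than anything confirmed for DeepRubric.

```python
from dataclasses import dataclass, field
from statistics import mean, pstdev


@dataclass
class EvidenceNode:
    """One verifiable sub-question and its answer (hypothetical schema)."""
    question: str
    answer: str
    children: list["EvidenceNode"] = field(default_factory=list)


def rubric_from_tree(root):
    """Collect one rubric criterion per node, in depth-first pre-order.

    Assumption: each criterion is simply the node's verified answer.
    """
    items = []
    stack = [root]
    while stack:
        node = stack.pop()
        items.append(node.answer)
        # Reverse so the first child is visited first.
        stack.extend(reversed(node.children))
    return items


def rubric_reward(report, rubric):
    """Fraction of rubric items covered by the report.

    Assumption: a case-insensitive substring match stands in for a real judge.
    """
    text = report.lower()
    hits = sum(1 for item in rubric if item.lower() in text)
    return hits / len(rubric)


def group_relative_advantages(rewards):
    """Standardize rewards within one sampling group (GRPO-style)."""
    mu = mean(rewards)
    sigma = pstdev(rewards)
    if sigma == 0:
        # All samples scored the same, so there is no learning signal.
        return [0.0 for _ in rewards]
    return [(r - mu) / sigma for r in rewards]


# Toy evidence tree; the content is illustrative only.
tree = EvidenceNode(
    question="What limits RL training of research agents?",
    answer="unreliable reward signals",
    children=[
        EvidenceNode("What causes that?", "incomplete rubrics"),
        EvidenceNode("What is the proposed fix?", "evidence tree"),
    ],
)
rubric = rubric_from_tree(tree)

# Three sampled reports for the same query form one group.
reports = [
    "Unreliable reward signals stem from incomplete rubrics; an evidence tree helps.",
    "Unreliable reward signals are the main issue.",
    "Nothing relevant.",
]
rewards = [rubric_reward(r, rubric) for r in reports]
advantages = group_relative_advantages(rewards)
```

Because every rubric item traces back to a node that was verified before the query was written, a report cannot score well by satisfying criteria that have no supporting evidence, which is the failure mode the summary attributes to incomplete rubrics.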