Skill-RM
product · active · provisional
skill-rm-806b8684 · 1 event · first seen 14d ago
Aliases: Skill-RM
Recent events (1)
Skill-RM: A unified reward model framework treating evaluation as an agentic skill
Researchers from the Qwen team propose Skill-RM, a framework that reformulates reward modeling as the execution of a reusable "Reward-Evaluation Skill," enabling a single model to orchestrate heterogeneous evaluation criteria, including rule-based verifiers, ground-truth references, and rubrics. By treating reward computation as a structured agentic task, Skill-RM dynamically selects and aggregates evidence for each input rather than applying one fixed evaluation procedure to every input. Experiments on reward benchmarks and downstream tasks (best-of-N selection, RL) show consistent improvements over traditional judge baselines. The code is publicly released under the Qwen-Applications GitHub organization.
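To make the "select and aggregate evidence per input" idea concrete, here is a toy sketch in Python. It is not the Skill-RM code or API: the `Sample` type, the `collect_evidence`, `reward`, and `best_of_n` functions, the exact-match and keyword-based checks, and the unweighted mean are all hypothetical stand-ins. In Skill-RM the selection and aggregation are described as being carried out by the model itself; here plain Python conditionals play that role purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Sample:
    """One (prompt, response) pair plus whatever evaluation resources exist."""
    prompt: str
    response: str
    reference: Optional[str] = None                    # ground-truth answer, if any
    rubric: list[str] = field(default_factory=list)    # key phrases a good answer mentions
    verifier: Optional[Callable[[str], bool]] = None   # rule-based check, if any


def collect_evidence(s: Sample) -> dict[str, float]:
    """Run only the evaluators that apply to this input (hypothetical logic)."""
    evidence: dict[str, float] = {}
    if s.verifier is not None:
        evidence["rule"] = 1.0 if s.verifier(s.response) else 0.0
    if s.reference is not None:
        # Assumption: exact string match stands in for reference comparison.
        evidence["reference"] = 1.0 if s.response.strip() == s.reference.strip() else 0.0
    if s.rubric:
        # Assumption: keyword presence stands in for rubric grading.
        hits = sum(item.lower() in s.response.lower() for item in s.rubric)
        evidence["rubric"] = hits / len(s.rubric)
    return evidence


def reward(s: Sample) -> float:
    """Aggregate the available evidence into one scalar (unweighted mean)."""
    evidence = collect_evidence(s)
    if not evidence:
        return 0.0
    return sum(evidence.values()) / len(evidence)


def best_of_n(candidates: list[Sample]) -> Sample:
    """Best-of-N selection: keep the candidate with the highest reward."""
    return max(candidates, key=reward)
```

With this sketch, a math-style sample that carries only a `reference` is scored by the reference check alone, while an open-ended sample that carries only a `rubric` is scored by rubric coverage, so the same `reward` function serves both. The same scalar could in principle drive best-of-N selection or serve as an RL reward signal, the two downstream uses the summary mentions.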