technique

REAlignment Reward

technique · active · provisional · realignment-reward-eafd7dc0 · 1 event · first seen 16h ago

Aliases: REAlignment Reward

Co-occurring entities

REAR: Test-time Preference Realignment through Reward Decomposition

More like this (12)

Hybrid Reward Advantage Splitting · REAR: Test-time Preference Realignment through Reward Decomposition · post-training alignment · DiT-Reward · reward model · CapReward · Gradient-Guided Reward Optimization · In-Context Reward Adaptation · The Alignment Project · AI alignment · MedAlign · rule-based reinforcement learning rewards

Recent events (1)

6 · arXiv · cs.CL · 16h ago

REAR: Test-time reward decomposition for preference realignment in LLMs

Researchers introduce REAR (REAlignment Reward), a training-free framework for aligning LLMs with diverse user preferences at test time. The method decomposes the reward function into question-related and preference-related components, then derives a realignment reward expressible as a linear combination of token-level log-probabilities. This formulation integrates cleanly with existing test-time scaling algorithms like best-of-N sampling and tree search, and experiments show it generalizes across preference alignment, math, and visual tasks.
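As a rough illustration of how a reward built from token-level log-probabilities could plug into best-of-N sampling, here is a minimal Python sketch. It is not the paper's formula: the summary only says the realignment reward is a linear combination of token-level log-probabilities, so the two scoring contexts (with and without the user preference), the weights `w_with` and `w_without`, and the function names `linear_logprob_reward` and `best_of_n` are all hypothetical choices made for this example. The log-probabilities are toy numbers, not model outputs.

```python
from typing import Sequence


def linear_logprob_reward(
    logp_with_pref: Sequence[float],
    logp_without_pref: Sequence[float],
    w_with: float = 1.0,      # hypothetical weight, not taken from the paper
    w_without: float = -1.0,  # hypothetical weight, not taken from the paper
) -> float:
    """Score one candidate as a weighted sum of per-token log-probabilities.

    Each sequence holds the log-probability of the same candidate tokens,
    scored once with the preference in the prompt and once without it.
    """
    if len(logp_with_pref) != len(logp_without_pref):
        raise ValueError("both sequences must cover the same tokens")
    return sum(
        w_with * a + w_without * b
        for a, b in zip(logp_with_pref, logp_without_pref)
    )


def best_of_n(candidates: Sequence[str], rewards: Sequence[float]) -> str:
    """Return the candidate with the highest reward (first one wins ties)."""
    if not candidates or len(candidates) != len(rewards):
        raise ValueError("need one reward per candidate")
    best_index = max(range(len(candidates)), key=lambda i: rewards[i])
    return candidates[best_index]


# Toy example: two candidate answers, two tokens each.
candidates = ["answer A", "answer B"]
token_logps = [
    ([-0.5, -0.5], [-1.0, -1.0]),  # A: more likely once the preference is given
    ([-2.0, -1.0], [-1.0, -1.0]),  # B: less likely once the preference is given
]
rewards = [linear_logprob_reward(w, wo) for w, wo in token_logps]
chosen = best_of_n(candidates, rewards)
```

Because the score is only a sum over tokens, the same function could in principle rank partial sequences during a tree search, which is consistent with the summary's claim that the formulation fits existing test-time scaling algorithms.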

Evaluation and Benchmarking · Inference Economics · REAlignment Reward · REAR: Test-time Preference Realignment through Reward Decomposition