technique
Rule-Based Rewards
active
rule-based-rewards-4ff25ef0 · 1 event · first seen 28d ago
Aliases: Rule-Based Rewards
More like this (12)
rule-based reinforcement learning rewards
rubric-based rewards
rubric-based reward shaping
Process Reward Model
Reinforcement Learning with Verifiable Rewards
Reward Learning from Comparisons
Constrained Reinforcement Learning
reward model
Hierarchical Reinforcement Learning
Rubric Reward
Goal-Conditioned Reinforcement Learning
Gradient-Guided Reward Optimization
Recent events (1)
Improving Model Safety Behavior with Rule-Based Rewards
OpenAI has developed a method called Rule-Based Rewards (RBRs) that trains models to behave safely without extensive human data collection. The approach uses explicit, written rules to generate reward signals during reinforcement learning, offering a more scalable alternative to gathering human feedback for safety alignment in RLHF. It is a practical contribution to alignment methodology from a Tier 1 lab.
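
The idea of turning explicit rules into a reward signal can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not OpenAI's implementation: the `Rule` type, the `contains_any` helper, the rule names, and the weights are hypothetical, and simple phrase matching stands in for whatever grader a real system would use to judge whether a response satisfies a rule.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Rule:
    """A hypothetical rule: a named check on a response plus a weight."""
    name: str
    check: Callable[[str], bool]  # True when the response matches the rule
    weight: float                 # positive = desired, negative = undesired


def contains_any(phrases: List[str]) -> Callable[[str], bool]:
    """Build a case-insensitive phrase matcher (a stand-in for a real grader)."""
    def check(response: str) -> bool:
        text = response.lower()
        return any(p in text for p in phrases)
    return check


# Illustrative rules for how a refusal should read (assumed, not from the source).
REFUSAL_RULES = [
    Rule("brief_apology", contains_any(["sorry", "apologize"]), 1.0),
    Rule("states_inability", contains_any(["can't help", "cannot help"]), 1.0),
    Rule("judgmental_language", contains_any(["shame on you"]), -2.0),
]


def rule_based_reward(response: str, rules: List[Rule]) -> float:
    """Sum the weights of every rule the response matches."""
    return sum(rule.weight for rule in rules if rule.check(response))


good = "I'm sorry, but I can't help with that."
bad = "Shame on you for asking. I cannot help."

print(rule_based_reward(good, REFUSAL_RULES))  # 2.0
print(rule_based_reward(bad, REFUSAL_RULES))   # -1.0
```

In a training loop, a score like this would be computed for each sampled response and used as (or added to) the reward, so the desired behavior is specified by editing rules and weights instead of collecting new human labels.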