Entity · technique

reinforcement learning with belief-state rewards

technique · active · reinforcement-learning-with-belief-state-rewards-d5939c34 · 1 event · first seen May 29, 2026

Aliases: reinforcement learning with belief-state rewards

Co-occurring entities

Contextual Belief Management (CBM) · BeliefTrack · representation-level steering · Zhejiang University NLP Group (ZJUNLP)

More like this (12)

rule-based reinforcement learning rewards
Reinforcement Learning with Metacognitive Feedback
reinforcement learning from verifier feedback
shielded reinforcement learning
Using Reward Uncertainty to Induce Diverse Behaviour in Reinforcement Learning
Improving LLM-Generated Process Model Quality Through Reinforcement Learning: The Role of Reward Function Design
sim-to-real reinforcement learning
Physics-EnhAnced Reinforcement Learning
curiosity-driven reinforcement learning
Reinforcement Learning for Code
Reinforcement Learning with Verifiable Rewards
Entropy-Regularized Reinforcement Learning

Recent events (1)

6 · arXiv · cs.CL · May 29, 2026

BeliefTrack: Benchmarking and Improving Contextual Belief Management in LLMs

This paper introduces Contextual Belief Management (CBM) as a framework for studying how LLMs should update, preserve, or ignore information across long-horizon interactions. The authors release BeliefTrack, a closed-world benchmark with symbolic verifiers enabling exact turn-level evaluation across Rule Discovery and Circuit Diagnosis tasks. Vanilla LLMs show severe CBM failures; reinforcement learning with belief-state rewards reduces failure rates by 70.9% on average, while representation-level steering achieves 46.1% reduction. Probing experiments reveal latent belief-state dynamics underlying these failures.
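To make the idea of a turn-level belief-state reward concrete, here is a minimal sketch in Python. It is not the paper's reward function, which the summary above does not specify. It assumes, purely for illustration, that a belief state can be written as a set of candidate hypotheses and that a symbolic verifier supplies the exact reference set at each turn. The function names and the Jaccard-overlap scoring are hypothetical choices.

```python
from typing import Iterable, List, Sequence


def belief_state_reward(predicted: Iterable[str], reference: Iterable[str]) -> float:
    """Score one turn as the Jaccard overlap of two hypothesis sets.

    Assumption (not from the paper): a belief state is a set of hypothesis
    labels, and `reference` is what a symbolic verifier derives for the turn.
    """
    p, r = frozenset(predicted), frozenset(reference)
    if not p and not r:
        # Both sets empty: the model agrees exactly with the verifier.
        return 1.0
    return len(p & r) / len(p | r)


def turn_level_rewards(
    predicted_turns: Sequence[Iterable[str]],
    reference_turns: Sequence[Iterable[str]],
) -> List[float]:
    """Return one reward per turn; both sequences must be the same length."""
    if len(predicted_turns) != len(reference_turns):
        raise ValueError("predicted and reference must cover the same turns")
    return [
        belief_state_reward(p, r)
        for p, r in zip(predicted_turns, reference_turns)
    ]


# Hypothetical three-turn episode: the model tracks the evidence at turn 1,
# keeps a stale hypothesis at turn 2, and ends on the wrong one at turn 3.
predicted = [{"A", "B", "C"}, {"A", "B"}, {"A"}]
reference = [{"A", "B", "C"}, {"B"}, {"B"}]
rewards = turn_level_rewards(predicted, reference)
```

A per-turn score of this kind gives partial credit for partially correct beliefs and can be summed or averaged into an episode return. Whether the authors use a graded or an exact-match signal is not stated in the summary above.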

Evaluation and Benchmarking · Agent and Tool Ecosystem · reinforcement learning with belief-state rewards · Contextual Belief Management (CBM) · BeliefTrack