semantic retrieval

technique · active · semantic-retrieval-7d8eec6f · 1 event · first seen 29d ago

Aliases: semantic retrieval

Recent events (1)

6 · arXiv · cs.CL · 29d ago

AMARIS: Memory-Augmented Rubric Improvement System for Rubric-Based Reinforcement Learning

AMARIS introduces a persistent evaluation memory system to improve rubric-based reward shaping in LLM fine-tuning via reinforcement learning. Unlike prior adaptive rubric methods that discard evaluation diagnostics after each step, AMARIS accumulates step-level summaries and retrieves relevant historical context via both static (recent steps) and dynamic (semantic similarity) retrieval to inform rubric updates. The system runs asynchronously alongside the RL training loop with approximately 5% time overhead. Experiments across closed and open-ended domains show consistent improvements over baselines, with ablations confirming that combining both retrieval modes yields the strongest results.
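
The two retrieval modes named in the summary can be illustrated with a minimal sketch. This is not the AMARIS implementation: `StepSummary`, `embed`, `cosine`, and `retrieve` are hypothetical names, the bag-of-words `embed` is a toy stand-in for whatever embedding model a real system would use, and the parameters `n_recent` and `k_similar` are assumptions. It only shows how a static window of recent steps might be merged with a similarity-ranked set of older steps.

```python
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class StepSummary:
    """Hypothetical record of one training step's evaluation diagnostics."""
    step: int
    text: str


def embed(text):
    # Toy stand-in for a real embedding model: lowercase bag-of-words counts.
    return Counter(text.lower().split())


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(memory, query, n_recent=2, k_similar=2):
    """Merge static (most recent) and dynamic (most similar) retrieval."""
    ordered = sorted(memory, key=lambda s: s.step)

    # Static retrieval: the last n_recent summaries, regardless of content.
    recent = ordered[-n_recent:] if n_recent > 0 else []
    recent_steps = {s.step for s in recent}

    # Dynamic retrieval: rank the remaining summaries by similarity to the query.
    q = embed(query)
    candidates = [s for s in ordered if s.step not in recent_steps]
    ranked = sorted(candidates, key=lambda s: cosine(q, embed(s.text)), reverse=True)
    similar = ranked[:k_similar]

    # Return the merged context in chronological order, without duplicates.
    return sorted(recent + similar, key=lambda s: s.step)
```

Excluding the recent window from the similarity candidates keeps the two modes from returning the same summary twice, which is one plausible way to combine them; other merging strategies are equally possible.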