Natural Language Inference
natural-language-inference-446a8e30 · 2 events · first seen 22d ago
Aliases: Natural Language Inference
Recent events (2)
Signal Collapse and Reward Hacking in Checker-Guided RAG for Biomedical QA
This paper investigates why NLI-based claim checkers used as process rewards in RL-trained medical RAG agents succeed or fail during training. The authors find that a checker's output distribution during training—not its held-out accuracy—determines whether it provides useful gradient signal, with LLM log-probability scoring causing near-total signal collapse (97%+ neutral labels) while a calibrated MedNLI classifier avoids this. A key finding is that stronger checkers can trigger reward hacking cascades (ultra-short answers, search avoidance, language collapse), while moderate-signal local classifiers yield better final model quality (+12% BERTScore over zero-shot). The work frames these as boundary conditions for verifier-as-reward systems in RLVR pipelines.
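The point that a checker's output distribution matters more than its held-out accuracy can be made concrete with a small diagnostic. The following is a minimal sketch, not the paper's method: the function name `checker_signal_report`, the label-to-reward mapping, and the 0.97 threshold (chosen to echo the "97%+ neutral" figure) are all assumptions for illustration. It summarises the labels a checker emits over a batch of claims and flags batches where nearly every label is neutral, which leaves almost no reward variance to learn from.

```python
from collections import Counter
from statistics import pvariance

# Hypothetical mapping from NLI label to scalar reward (an assumption;
# the summary above does not specify how labels become rewards).
REWARD = {"entailment": 1.0, "neutral": 0.0, "contradiction": -1.0}


def checker_signal_report(labels, collapse_threshold=0.97):
    """Summarise a checker's label distribution over one batch of claims.

    Returns the fraction of each label, the population variance of the
    mapped rewards, and a flag that is True when the neutral fraction
    reaches `collapse_threshold`.
    """
    if not labels:
        raise ValueError("labels must be non-empty")
    counts = Counter(labels)
    total = len(labels)
    fractions = {name: counts.get(name, 0) / total for name in REWARD}
    rewards = [REWARD[label] for label in labels]
    return {
        "fractions": fractions,
        "reward_variance": pvariance(rewards),
        "collapsed": fractions["neutral"] >= collapse_threshold,
    }


# Two synthetic batches, invented purely for illustration.
collapsed_batch = ["neutral"] * 98 + ["entailment"] * 2
healthy_batch = ["entailment"] * 40 + ["neutral"] * 35 + ["contradiction"] * 25

collapsed_report = checker_signal_report(collapsed_batch)
healthy_report = checker_signal_report(healthy_batch)
```

Tracking a statistic like this during training, rather than relying on the checker's held-out accuracy alone, is one way to notice collapse early. The right threshold would depend on the task and is not given in the source.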
Cross-Annotator Preference Optimization (CAPO) for Learning Annotator-Specific Explanation Behavior
This paper investigates whether LLMs can learn and reproduce individual annotator-specific reasoning patterns, not just label choices, using two sentence-pair tasks (NLI and paraphrase judgment) with four annotators each. The authors find that annotator-specific patterns are weak at the single-annotation level but detectable after aggregation, and propose CAPO—a preference optimization method that contrasts a target annotator's response against other valid but less target-specific annotations. CAPO outperforms prompting and supervised fine-tuning baselines in capturing annotator-specific label-explanation behavior. The work suggests a path toward scalable annotation pipelines grounded in annotator histories rather than labels alone.
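The contrast CAPO draws, a target annotator's response against other valid annotations of the same item, can be sketched as a pair-construction step. The code below is an illustration under assumptions, not the authors' implementation: the `Annotation` record, the `build_preference_pairs` helper, and the chosen/rejected dictionary layout are hypothetical, and the summary does not say which preference loss is applied to the resulting pairs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """One annotator's label and explanation for one sentence pair."""
    item_id: str
    annotator: str
    label: str
    explanation: str


def build_preference_pairs(annotations, target):
    """Pair the target annotator's annotation (chosen) with every other
    annotator's annotation of the same item (rejected)."""
    by_item = {}
    for ann in annotations:
        by_item.setdefault(ann.item_id, []).append(ann)

    pairs = []
    for item_id, anns in by_item.items():
        chosen = [a for a in anns if a.annotator == target]
        rejected = [a for a in anns if a.annotator != target]
        for c in chosen:
            for r in rejected:
                pairs.append({
                    "item_id": item_id,
                    "chosen": f"{c.label}: {c.explanation}",
                    "rejected": f"{r.label}: {r.explanation}",
                })
    return pairs


# Toy data with four annotators on one item, invented for illustration.
toy = [
    Annotation("ex1", "A", "entailment", "The hypothesis restates the premise."),
    Annotation("ex1", "B", "entailment", "Both sentences describe the same event."),
    Annotation("ex1", "C", "neutral", "The premise does not mention the time."),
    Annotation("ex1", "D", "entailment", "The subject and action match."),
]
pairs_for_a = build_preference_pairs(toy, target="A")
```

With four annotators per item, each item yields three pairs for a given target. Note that the rejected side is still a valid annotation, only less specific to the target annotator, which is the distinction the summary emphasises.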