Entity · technique

Token-level Rationales

technique · active · token-level-rationales-9519189e · 1 event · first seen Jun 1, 2026

Aliases: Token-level Rationales

Co-occurring entities

Faithfulness Evaluation · Plausibility Evaluation · hate speech detection · Soft Label Supervision

More like this (12)

RREDCoT: Segment-Level Reward Redistribution for Reasoning Models · Hybrid Reward Advantage Splitting · OneReason Technical Report · bounding box symbolic rationales · Token Budget Saturation and Mechanistic Early Detection of Reasoning Non-Convergence in Chain-of-Thought Models · REAR: Test-time Preference Realignment through Reward Decomposition · ReToken · representation-level steering · Expert Token Rank · Equilibrium Reasoners (EqR) · token credit assignment · ReasonAlloc

Recent events (1)

5 · arXiv · cs.CL · Jun 1, 2026

Disagreeing Rationales: Rethinking Classification and Explainability Evaluation in Hate Speech Detection

This paper investigates human disagreement in token-level rationale annotations for hate speech detection, a dimension less studied than label disagreement. The authors unify diverse models, training strategies, loss functions, and evaluation metrics under a single protocol, systematically comparing hard and soft label/rationale representation spaces. Results show that both hard and soft metrics favor softer representations, suggesting that soft supervision better captures human reasoning variation in subjective NLP tasks. The work calls for rethinking evaluation frameworks for classification and explainability in subjective NLP.
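To make the hard versus soft distinction concrete, the sketch below aggregates token-level rationale masks from several annotators in two ways: a soft representation (the fraction of annotators who marked each token) and a hard one (a thresholded majority vote). This is an illustration only, not the paper's protocol; the function names, the 0.5 threshold, and the toy annotations are all assumptions introduced here.

```python
from typing import List


def soft_rationale(annotations: List[List[int]]) -> List[float]:
    """Per-token fraction of annotators who marked the token (assumed soft form)."""
    n_annotators = len(annotations)
    # zip(*annotations) iterates over tokens, yielding one column of votes each.
    return [sum(votes) / n_annotators for votes in zip(*annotations)]


def hard_rationale(annotations: List[List[int]], threshold: float = 0.5) -> List[int]:
    """Binarize the soft rationale; the 0.5 threshold is an assumption."""
    return [1 if p >= threshold else 0 for p in soft_rationale(annotations)]


# Hypothetical toy data: three annotators, four tokens, 1 = token is part of the rationale.
annotations = [
    [0, 1, 0, 1],
    [0, 0, 0, 1],
    [0, 1, 0, 1],
]

soft = soft_rationale(annotations)  # keeps the 2-of-3 disagreement on the second token
hard = hard_rationale(annotations)  # collapses that disagreement to a single 0/1 decision
print(soft, hard)
```

The second token shows what the hard form discards: two of three annotators marked it, which the soft form records as a graded value and the hard form rounds to a plain 1.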

Evaluation and Benchmarking · Alignment and RLHF · Token-level Rationales · Faithfulness Evaluation · Plausibility Evaluation · +2 more