
UltraFeedback

dataset · active · provisional · ultrafeedback-f2b18910 · 1 event · first seen 6d ago

Aliases: UltraFeedback


Recent events (1)

5 · arXiv · cs.CL · 6d ago

Information-theoretic metric for measuring semantic progress in multi-turn dialogue

A new arXiv preprint formalizes 'semantic progress' in multi-turn dialogue as question-conditioned uncertainty reduction and introduces an information-theoretic metric for it, approximated in embedding space with a Gaussian formulation that admits closed-form updates. The metric satisfies three theoretical properties (monotonicity, additive decomposition, and diminishing returns) and requires no autoregressive inference at evaluation time, which makes it reproducible and lightweight. In experiments on MT-Bench, Chatbot Arena, and UltraFeedback, it shows competitive or improved agreement with human judgments relative to several LLM-as-a-judge baselines. The approach runs with lightweight embedding models on CPU only.
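
To make the idea of a Gaussian formulation with closed-form updates concrete, here is a minimal sketch. It is not the preprint's implementation: it assumes a diagonal Gaussian belief over an answer embedding, and it takes per-turn noise variances as given inputs rather than deriving them from an embedding model. The function names `update`, `info_gain`, and `semantic_progress` are hypothetical. Under those assumptions, the per-turn entropy reduction is non-negative, sums to the total reduction, and shrinks when the same evidence is repeated.

```python
import math

def update(prior_var, noise_var):
    """Closed-form Gaussian update per dimension: precisions add."""
    return [1.0 / (1.0 / p + 1.0 / n) for p, n in zip(prior_var, noise_var)]

def info_gain(prior_var, post_var):
    """Entropy reduction (in nats) between two diagonal Gaussians."""
    return 0.5 * sum(math.log(p / q) for p, q in zip(prior_var, post_var))

def semantic_progress(prior_var, turn_noise_vars):
    """Return per-turn information gains and the final variance vector.

    prior_var: assumed initial uncertainty per embedding dimension.
    turn_noise_vars: one hypothetical noise-variance vector per dialogue
        turn (smaller values mean a more informative turn).
    """
    gains = []
    var = list(prior_var)
    for noise in turn_noise_vars:
        new_var = update(var, noise)
        gains.append(info_gain(var, new_var))
        var = new_var
    return gains, var

# Toy example: three equally informative turns in a 2-dimensional space.
prior = [1.0, 1.0]
turns = [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]]
gains, final_var = semantic_progress(prior, turns)
total = info_gain(prior, final_var)
```

In this toy run every gain is non-negative, the gains sum to `total`, and each repeated turn contributes less than the one before it, which mirrors the three properties named in the summary. No model inference is involved, only arithmetic on variances.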