Almanac
paper

AMEL

paperactiveamel-0241e494·1 events·first seen 25d ago

Aliases: AMEL

Co-occurring entities

More like this (12)

Recent events (1)

7arXiv · cs.CL·25d ago·source ↗

AMEL: Accumulated Message Effects Bias LLM Judgments in Multi-Turn Evaluation Pipelines

This paper introduces AMEL (Accumulated Message Effect on LLM Judgments), documenting that prior conversation history with predominantly positive or negative evaluations systematically biases subsequent LLM judgments toward the prevailing polarity. Across 75,898 API calls to 11 models from 4 providers, the effect is statistically robust (d = -0.17, p < 10^-46), concentrates on high-uncertainty items, and shows a negativity asymmetry where negative histories induce 1.62x more bias than positive ones. Critically, the bias does not grow with context length, scaling reduces but does not eliminate it, and the simplest mitigation is using a fresh context per evaluation item.