From Self to Other: Evaluating Demographic Perspective-Taking in LLM Hate Speech Annotation

arXiv · cs.CL · 11d ago

LLMs fail to consistently simulate demographic perspective-taking in hate speech annotation

A new arXiv paper evaluates whether persona-conditioned LLMs can replicate how different demographic groups perceive hate speech. It tests three dimensions: inter-group disagreement, in-group sensitivity, and vicarious prediction. No model consistently captures all three, and performance depends heavily on the model rather than emerging reliably from identity prompts alone. Vicarious prompting with Llama 3.1 comes closest to human disagreement patterns across demographic axes. The findings bear on the use of LLMs as proxies for diverse human annotators in content moderation.
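
To make the distinction between identity prompting and vicarious prompting concrete, here is a minimal Python sketch. It is illustrative only: the summary above does not give the paper's prompt wording or metrics, so the templates, the function names (`persona_prompt`, `vicarious_prompt`, `disagreement_rate`), and the simple label-mismatch measure are all assumptions, not the authors' method. The sketch assumes "vicarious" means asking the model to predict how a member of another group would judge a text, rather than asking it to adopt that identity itself.

```python
from typing import Sequence


def persona_prompt(group: str, text: str) -> str:
    """Identity ("self") prompt: the model is told to adopt the group identity.

    Hypothetical wording; not taken from the paper.
    """
    return (
        f"You are a person who identifies as {group}. "
        "Is the following text hate speech? Answer yes or no.\n\n"
        f"Text: {text}"
    )


def vicarious_prompt(group: str, text: str) -> str:
    """Vicarious ("other") prompt: the model predicts another group's judgment.

    Hypothetical wording; not taken from the paper.
    """
    return (
        f"How would a person who identifies as {group} judge the following text? "
        "Would they consider it hate speech? Answer yes or no.\n\n"
        f"Text: {text}"
    )


def disagreement_rate(labels_a: Sequence[int], labels_b: Sequence[int]) -> float:
    """Fraction of items on which two groups' binary labels differ.

    An assumed stand-in for inter-group disagreement; the paper may use a
    different statistic.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    if len(labels_a) == 0:
        raise ValueError("label sequences must be non-empty")
    mismatches = sum(a != b for a, b in zip(labels_a, labels_b))
    return mismatches / len(labels_a)
```

Under these assumptions, one would compute `disagreement_rate` on human labels from two demographic groups, compute it again on model labels obtained with each prompt style, and compare the two rates to see which prompt style tracks the human pattern more closely.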