Helpfulness Consistency

benchmark · active · helpfulness-consistency-bc3f86ba · 1 event · first seen 26d ago

Aliases: Helpfulness Consistency

Recent events (1)

6 · arXiv · cs.AI · 26d ago

Political Consistency Training: Reducing Covert Political Bias in LLMs via RL

Researchers identify a phenomenon called 'covert political bias' in LLMs, where models handle politically paired topics asymmetrically across 7 identified technique categories. They propose two metrics—Sentiment Consistency and Helpfulness Consistency—to measure this asymmetry. To address it, they introduce Political Consistency Training (PCT), an RL-based method with complementary training paradigms that reduces covert bias while preserving overall helpfulness and generalizing to held-out benchmarks.