benchmark
paired-scenario forced-choice probe
benchmark · active · provisional
paired-scenario-forced-choice-probe-57b2a1ce · 1 event · first seen 22d ago
Aliases: paired-scenario forced-choice probe
Recent events (1)
Geopolitical Bias in LLMs Originates in Post-Training, Not Pre-Training Data
A study comparing base and chat versions of open-weight LLMs from seven labs (seven model pairs in total) finds that geopolitical bias is introduced during post-training rather than inherited from pre-training data. Six of the seven labs showed post-training shifts favoring the developer's home country or region; Alibaba's Qwen 2.5 showed the most extreme shift, an 18x increase in China-favourability log-odds. The effect is also language-dependent: Mistral becomes pro-France only under French prompting. The authors argue that this implicates alignment and RLHF processes as active shapers of geopolitical perspective, and they call for greater transparency and auditing of post-training pipelines.
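As a rough illustration of the kind of quantity a forced-choice probe can report, the sketch below computes a favourability log-odds from a list of forced choices and compares a base model with its chat counterpart. The summary does not describe the study's actual scoring procedure, so everything here is an assumption made for illustration: the function names, the `smoothing` pseudo-count, the country labels, and the toy trial counts are all hypothetical and are not the study's data.

```python
import math


def favourability_log_odds(choices, target, smoothing=0.5):
    """Log-odds that a model picks `target` across forced-choice trials.

    `choices` holds one label per trial (the side the model chose).
    `smoothing` is a hypothetical additive pseudo-count that keeps the
    log-odds finite when one side is never chosen.
    """
    favour = sum(1 for c in choices if c == target)
    against = len(choices) - favour
    return math.log((favour + smoothing) / (against + smoothing))


def post_training_shift(base_choices, chat_choices, target):
    """Difference in log-odds between the chat model and the base model."""
    return (favourability_log_odds(chat_choices, target)
            - favourability_log_odds(base_choices, target))


# Toy data only: a base model that splits evenly and a chat model
# that picks the target side in 15 of 20 trials.
base = ["CN"] * 10 + ["US"] * 10
chat = ["CN"] * 15 + ["US"] * 5

shift = post_training_shift(base, chat, target="CN")
print(f"post-training shift in log-odds: {shift:.3f}")
```

A positive shift means the chat model favours the target more often than the base model does. Whether a reported figure such as "18x" refers to a ratio or a difference of such values depends on the study's own definition, which this sketch does not attempt to reproduce.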