Entity · benchmark

paired-scenario forced-choice probe

benchmark · active · paired-scenario-forced-choice-probe-57b2a1ce · 1 event · first seen May 25, 2026

Aliases: paired-scenario forced-choice probe

Co-occurring entities

Mistral AI · Alibaba · Mistral · geopolitical bias · Qwen 2.5-7B · post-training alignment

More like this (12)

probing classifiers · Reverse Probing · Facet-Probe · logistic regression probes · Text-Only Probe · mass-mean probing · hidden state probing · Unified Latent Probe · TypeProbe · few-shot prompting · visual-token activation probing · Chain-Text Probe

Recent events (1)

7 · arXiv · cs.AI · May 25, 2026

Geopolitical Bias in LLMs Originates in Post-Training, Not Pre-Training Data

A study testing seven open-weight LLM pairs (base vs. chat models) across seven labs finds that geopolitical bias is introduced during post-training rather than inherited from pre-training data. Six of seven labs showed post-training shifts favoring the developer's home country or region, with Alibaba's Qwen 2.5 showing the most extreme shift (18x increase in China-favourability log-odds). The effect is also language-dependent: Mistral becomes pro-France only under French prompting. The authors argue this implicates alignment and RLHF processes as active shapers of geopolitical perspective, calling for greater transparency and auditing of post-training pipelines.
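The summary reports the effect as a change in favourability log-odds between a base model and its chat counterpart. The summary does not say how the study computes this metric, so the following is only a minimal sketch of one way to turn forced-choice counts into log-odds and compare the two models. The function names, the additive smoothing constant, and all counts are hypothetical and chosen purely for illustration.

```python
import math


def log_odds(favourable: int, total: int, smoothing: float = 0.5) -> float:
    """Log-odds of choosing the home-favourable option in a forced-choice probe.

    Additive smoothing (an assumption, not taken from the source) keeps the
    value finite when a model always or never picks the favourable option.
    """
    unfavourable = total - favourable
    return math.log((favourable + smoothing) / (unfavourable + smoothing))


def post_training_shift(base_fav: int, base_total: int,
                        chat_fav: int, chat_total: int) -> float:
    """Difference in log-odds between the chat model and the base model."""
    return log_odds(chat_fav, chat_total) - log_odds(base_fav, base_total)


# Hypothetical counts: number of probe items (out of 100) on which each
# model picked the option favouring the developer's home country.
base_lo = log_odds(50, 100)   # base model is indifferent
chat_lo = log_odds(80, 100)   # chat model leans toward the home country
shift = post_training_shift(50, 100, 80, 100)

print(f"base log-odds: {base_lo:.3f}")
print(f"chat log-odds: {chat_lo:.3f}")
print(f"post-training shift: {shift:.3f}")
```

A positive shift means the chat model picks the home-favourable option more often than its base model. A multiplicative figure such as the "18x" quoted above would instead compare the two log-odds values as a ratio, which is only meaningful when the base value is not close to zero.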

Evaluation and Benchmarking · Open Weights Progress · Mistral AI · Alibaba · Mistral