Language Identity Head Ablation
language-identity-head-ablation-03c97e02 · 1 event · first seen 4h ago
Aliases: Language Identity Head Ablation
Recent events (1)
LIHA reveals first-token broadcaster heads as mechanistic source of language identity in transformers
Researchers introduce Language Identity Head Ablation (LIHA), a causal intervention that zeros out individual attention heads and measures the resulting language-switching behavior across 2,700 prompt-language pairs in seven languages. Applied to GPT-2, LIHA identifies a small set of 'first-token broadcaster' heads that propagate language identity signals throughout generation; the compensatory redistribution observed under ablation follows a hierarchical, feedforward pattern. A controlled comparison between Qwen2.5-1.5B-Base and Qwen2.5-1.5B-Instruct provides direct causal evidence that instruction tuning reorganizes language identity circuits toward early-layer localization. The findings offer mechanistic grounding for why multilingual models generate text in the wrong language and why this behavior is difficult to correct.
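To make the idea of zero-ablating a head and measuring language switching concrete, here is a minimal toy sketch in pure Python. It is not the LIHA implementation: the per-head "language logit" vectors, the summed readout, and the names `zero_ablate`, `predict_language`, and `switch_rate` are all hypothetical stand-ins chosen for illustration. A real experiment would zero a head's output inside the model's forward pass and run a language detector on the generated text.

```python
from typing import Dict, List, Tuple

Head = Tuple[int, int]  # (layer index, head index); hypothetical addressing scheme
HeadOutputs = Dict[Head, List[float]]  # toy per-head contribution to language logits

LANGUAGES = ["en", "fr"]


def zero_ablate(head_outputs: HeadOutputs, ablated: Head) -> HeadOutputs:
    """Return a copy of the per-head outputs with one head's output set to zero."""
    return {
        head: [0.0] * len(vec) if head == ablated else list(vec)
        for head, vec in head_outputs.items()
    }


def predict_language(head_outputs: HeadOutputs) -> str:
    """Toy readout (an assumption): sum all head outputs and take the argmax."""
    logits = [
        sum(vec[i] for vec in head_outputs.values())
        for i in range(len(LANGUAGES))
    ]
    return LANGUAGES[logits.index(max(logits))]


def switch_rate(examples: List[Tuple[HeadOutputs, str]], head: Head) -> float:
    """Fraction of examples whose output language leaves the target after ablation."""
    switched = sum(
        1
        for head_outputs, target in examples
        if predict_language(zero_ablate(head_outputs, head)) != target
    )
    return switched / len(examples)


# Two made-up prompt-language pairs. Head (0, 0) carries most of the target
# language signal, head (0, 1) carries a little, and head (1, 0) is a fixed
# English bias regardless of the target.
examples = [
    ({(0, 0): [0.0, 2.0], (0, 1): [0.1, 0.3], (1, 0): [1.0, 0.0]}, "fr"),
    ({(0, 0): [2.0, 0.0], (0, 1): [0.3, 0.1], (1, 0): [1.0, 0.0]}, "en"),
]

heads = [(0, 0), (0, 1), (1, 0)]
rates = {head: switch_rate(examples, head) for head in heads}

for head, rate in rates.items():
    print(f"head {head}: switch rate {rate:.2f}")
```

In this toy setup only the ablation of head (0, 0) flips the French example to English, so that head alone has a nonzero switch rate. Ranking heads by this kind of rate is one plausible way to surface candidates such as the 'first-token broadcaster' heads described above, though the criteria used in the study itself are not given here.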