dataset
Wiktionary
dataset · active · provisional
wiktionary-b9fd6a85 · 1 event · first seen 2d ago
Aliases: Wiktionary
Recent events (1)
G-IdiomAlign: Gloss-pivoted benchmark for cross-lingual idiom alignment in LLMs
Researchers introduce G-IdiomAlign, a benchmark anchoring idioms via English glosses from Wiktionary to evaluate cross-lingual idiom equivalence in LLMs. The benchmark supports two evaluation protocols: a multiple-choice task with typed distractors and a gloss-contrastive generation task isolating the effect of explicit semantic pivots. Experiments across diverse LLMs find that literal translation bias is the dominant failure mode, especially for low-resource languages, and that gloss conditioning improves performance but leaves substantial headroom. Mechanistic analysis on Qwen3-8B suggests cross-condition differences are concentrated in attention heads rather than layers.