Field Order Should Not Matter: Permutation-Invariant Embedding Model Fine-Tuning for Structured Metadata Retrieval
Recent events (1)
Permutation-Invariant Fine-Tuning (PI-FT) eliminates field-order sensitivity in structured metadata retrieval
Researchers find that text encoders fine-tuned for structured metadata retrieval silently overfit to the order in which fields are serialized, losing 7.4 nDCG@10 points when the field order changes at index time. They propose PI-FT, a two-line data-loader change that randomizes field order and applies random field dropout during fine-tuning, which reduces the order-change penalty to 0.2 points. The paper also introduces DevDataBench, a fully LLM-generated multilingual benchmark covering ~10,000 development-statistics indicators across 15 languages, and shows that a fine-tuned 118M-parameter CPU encoder outperforms zero-shot text-embedding-3-large (0.707 vs. 0.556 nDCG@10), with strong gains in low-resource languages.
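The data-loader idea described above (shuffle field order, randomly drop fields) can be illustrated with a minimal sketch. This is not the authors' code: the function name `serialize_record`, the `key: value | key: value` serialization format, the default dropout probability of 0.1, and the example field names are all assumptions made for illustration.

```python
import random


def serialize_record(record, rng, drop_prob=0.1, training=True):
    """Turn a metadata record (dict) into one string for a text encoder.

    Hypothetical sketch: the "key: value | key: value" format and the
    drop_prob default are assumptions, not taken from the paper.
    """
    # Skip empty fields so they never appear in the serialized text.
    items = [(k, v) for k, v in record.items() if v not in (None, "")]
    if not items:
        return ""

    if training:
        # Random field dropout: drop each field independently with
        # probability drop_prob, but always keep at least one field.
        kept = [kv for kv in items if rng.random() >= drop_prob]
        if not kept:
            kept = [rng.choice(items)]
        # Random field order: shuffle so the encoder cannot rely on position.
        rng.shuffle(kept)
        items = kept

    return " | ".join(f"{k}: {v}" for k, v in items)


if __name__ == "__main__":
    # Example record with made-up field names.
    record = {"title": "GDP per capita", "unit": "USD", "source": "World Bank"}
    rng = random.Random(0)
    for _ in range(3):
        print(serialize_record(record, rng))
```

With `training=False` the function keeps every non-empty field in its original order, which is one plausible choice for serializing documents at index time; during fine-tuning each call yields a different ordering and subset of the same record.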