Attention Sorting

technique · active · provisional · attention-sorting-4e5c6776 · 1 event · first seen 41h ago

Aliases: Attention Sorting

Recent events (1)

4 · arXiv · cs.CL · 41h ago

Debiased One-Pass Attention Sorting fails to close gap with iterative sorting for long-context LLMs

A new arXiv preprint investigates whether position bias is the primary bottleneck in long-context LLM performance, proposing Debiased One-Pass Attention Sorting as a cheaper alternative to iterative Attention Sorting. Experiments on LLaMA-2-7B-32K-Instruct and YaRN-Llama-2-7b-64k show that bias correction alone is insufficient: on one model it provides no improvement over uncalibrated single-pass sorting, and on the other it closes only 37% of the gap to iterative sorting. The findings suggest that iterative reordering provides benefits beyond position-bias correction, leaving the efficiency-accuracy tradeoff unresolved.
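
To make the distinction between the three strategies concrete, here is a minimal toy sketch. It is not the preprint's implementation. The callback `attention_fn`, the helper names, the additive bias model, and all numbers are illustrative assumptions. The convention of placing the highest-attention document last (closest to the query) is also an assumption about how Attention Sorting orders documents.

```python
from typing import Callable, List, Sequence

# Hypothetical callback: given documents in context order, return one
# attention score per document. A real system would read these from a
# model's attention weights during a decoding step.
AttentionFn = Callable[[Sequence[str]], List[float]]


def sort_by_scores(docs: Sequence[str], scores: Sequence[float]) -> List[str]:
    """Reorder docs so the highest-scoring document comes last (assumed convention)."""
    order = sorted(range(len(docs)), key=lambda i: scores[i])
    return [docs[i] for i in order]


def iterative_attention_sort(docs: Sequence[str], attention_fn: AttentionFn,
                             rounds: int) -> List[str]:
    """Re-score and re-sort the context `rounds` times (one model pass per round)."""
    current = list(docs)
    for _ in range(rounds):
        current = sort_by_scores(current, attention_fn(current))
    return current


def debiased_one_pass_sort(docs: Sequence[str], attention_fn: AttentionFn,
                           position_bias: Sequence[float]) -> List[str]:
    """Score once, subtract an estimated per-position bias, then sort once."""
    raw = attention_fn(docs)
    corrected = [s - b for s, b in zip(raw, position_bias)]
    return sort_by_scores(docs, corrected)


# Toy model (assumption): attention = true relevance + a bias that grows
# toward the end of the context.
RELEVANCE = {"a": 0.1, "b": 0.55, "c": 0.3}
POSITION_BIAS = [0.0, 0.2, 0.4]


def toy_attention(docs: Sequence[str]) -> List[float]:
    return [RELEVANCE[d] + POSITION_BIAS[i] for i, d in enumerate(docs)]


start = ["b", "a", "c"]
single_pass = iterative_attention_sort(start, toy_attention, rounds=1)
iterated = iterative_attention_sort(start, toy_attention, rounds=3)
debiased = debiased_one_pass_sort(start, toy_attention, POSITION_BIAS)
```

In this toy setup the uncalibrated single pass leaves the most relevant document "b" in the middle, while both the iterated sort and the debiased single pass move it to the end. That agreement is a property of the purely additive bias assumed here. The preprint's results indicate that real models do not behave this way, which is why bias correction alone falls short of iterative sorting.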