technique
Debiased One-Pass Attention Sorting
technique · active · provisional
debiased-one-pass-attention-sorting-13bb211f · 1 event · first seen 41h ago
Aliases: Debiased One-Pass Attention Sorting
More like this (12)
Position Bias Correction is Insufficient for One-Pass Attention Sorting
Attention Sorting
ProbSparse Attention
Block Sparse Attention
Differential Attention
DOA (Decoder-Only Attention)
bidirectional attention
sparse attention
Sliding Window Attention
Graph Attention Network
Cross-Layer Sparse Attention
MiniMax Sparse Attention
Recent events (1)
Debiased One-Pass Attention Sorting fails to close gap with iterative sorting for long-context LLMs
A new arXiv preprint investigates whether position bias is the primary bottleneck in long-context LLM performance, proposing Debiased One-Pass Attention Sorting as a cheaper alternative to iterative Attention Sorting. Experiments on LLaMA-2-7B-32K-Instruct and YaRN-Llama-2-7b-64k show that bias correction alone is insufficient: on one model it provides no improvement over uncalibrated single-pass sorting, and on the other it closes only 37% of the gap to iterative sorting. The findings suggest that iterative reordering provides benefits beyond position-bias correction, leaving the efficiency-accuracy tradeoff unresolved.
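The summary above does not spell out the procedure, so the following is only a minimal sketch of what a debiased single-pass reordering could look like, plus the arithmetic behind a "fraction of the gap closed" figure. It assumes that per-document attention is measured in one forward pass, that a per-slot position-bias estimate is available from some calibration step, and that the highest-scoring document is placed last. The function names, inputs, and numbers are all hypothetical and are not taken from the preprint.

```python
from typing import Sequence


def debiased_one_pass_order(
    doc_attention: Sequence[float],
    position_bias: Sequence[float],
) -> list[int]:
    """Return document indices sorted by bias-corrected attention, ascending.

    doc_attention[i]: attention the model gave to the document in slot i
                      (hypothetical input from a single forward pass).
    position_bias[i]: attention slot i tends to receive regardless of content
                      (hypothetical calibration estimate).
    """
    if len(doc_attention) != len(position_bias):
        raise ValueError("need one bias estimate per document slot")
    # Assumed correction: subtract the positional baseline from the raw score.
    corrected = [a - b for a, b in zip(doc_attention, position_bias)]
    # Ascending sort puts the highest corrected score last in the context.
    return sorted(range(len(corrected)), key=lambda i: corrected[i])


def gap_closed(one_pass: float, debiased: float, iterative: float) -> float:
    """Fraction of the one-pass-to-iterative accuracy gap recovered by debiasing."""
    gap = iterative - one_pass
    if gap == 0:
        raise ValueError("no gap to close")
    return (debiased - one_pass) / gap


# Illustrative numbers only (not results from the preprint).
raw_order = debiased_one_pass_order([0.5, 0.2, 0.3], [0.0, 0.0, 0.0])
debiased_order = debiased_one_pass_order([0.5, 0.2, 0.3], [0.4, 0.05, 0.1])
fraction = gap_closed(one_pass=40.0, debiased=47.4, iterative=60.0)
```

With these made-up inputs, the uncorrected ordering places slot 0 last because it has the highest raw attention, while the corrected ordering moves it first once its assumed positional advantage is subtracted. The `gap_closed` helper shows how a figure such as 37% would be computed from three accuracy values.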