
NLL-Guided Full-Attention Layer Selection for Training-Free Sliding-Window Adaptation



Co-occurring entities

LongMemEval, Qwen3-4B, LightTransfer

More like this (12)

Sliding Window Attention
Layer-Adaptive Expert Pruning
Cross-Layer Sparse Attention
LoRA (Low-Rank Adaptation)
Beyond Fully Random Masking: Attention-Guided Denoising and Optimization for Diffusion Language Models
ExpRL: Exploratory RL for LLM Mid-Training
Hierarchical Advantage Weighting for Online RL Fine-Tuning of VLAs from Sparse Episode Outcomes
Data Selection Through Iterative Self-Filtering for Vision-Language Settings
Observe-and-Act Adaptive Context Selection
MiniMax Sparse Attention
Paved with True Intents: Intent-Aware Training Improves LLM Safety Classification Across Training Regimes
Multi-head Latent Attention (MLA)

Recent events (1)

arXiv · cs.CL · 37h ago

NLL-guided training-free method selects optimal full-attention layers for efficient long-context inference

Researchers propose NLL-guided layer selection, a training-free technique for hybrid attention models. It decides which layers should keep full attention and which can use sliding-window attention by measuring how much the negative log-likelihood (NLL) of the answer tokens degrades. On LongMemEval with Qwen3-4B, the method reaches 64.6% accuracy with full attention in only one quarter of the layers (1/4-FA). That matches a periodic 1/2-FA baseline while halving the number of full-attention layers, and it outperforms a periodic 1/4-FA baseline by 10.4 percentage points. Calibration is a one-time step that takes approximately 15 minutes of compute, which makes the method practical for deployment. The work improves the efficiency-accuracy tradeoff for long-context LLM inference without any retraining.
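The selection step described above can be illustrated with a minimal sketch. It assumes, as one plausible reading of the summary, that each layer is scored by the answer-token NLL observed when that layer alone is switched to sliding-window attention, and that the layers with the largest degradation keep full attention. The function names, the per-layer scoring scheme, and all numbers below are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence


def answer_nll(token_log_probs: Sequence[float], answer_mask: Sequence[bool]) -> float:
    """Mean negative log-likelihood over the answer tokens only."""
    picked = [-lp for lp, is_answer in zip(token_log_probs, answer_mask) if is_answer]
    if not picked:
        raise ValueError("answer_mask selects no tokens")
    return sum(picked) / len(picked)


def nll_degradation(baseline_nll: float, ablated_nll: Dict[int, float]) -> Dict[int, float]:
    """Per-layer NLL increase relative to the all-full-attention baseline.

    Assumption: ablated_nll[i] is the answer-token NLL measured with only
    layer i switched to sliding-window attention.
    """
    return {layer: nll - baseline_nll for layer, nll in ablated_nll.items()}


def select_full_attention_layers(degradation: Dict[int, float], budget: int) -> List[int]:
    """Keep full attention in the `budget` layers whose ablation hurts most."""
    ranked = sorted(degradation, key=lambda layer: (-degradation[layer], layer))
    return sorted(ranked[:budget])


def periodic_layers(num_layers: int, period: int) -> List[int]:
    """Periodic baseline: every `period`-th layer keeps full attention."""
    return list(range(period - 1, num_layers, period))


# Toy calibration numbers for an 8-layer model (hypothetical values).
baseline = 0.50
ablated = {0: 0.52, 1: 0.50, 2: 0.91, 3: 0.51, 4: 0.55, 5: 1.30, 6: 0.53, 7: 0.60}

scores = nll_degradation(baseline, ablated)
guided = select_full_attention_layers(scores, budget=2)  # 1/4 of 8 layers -> [2, 5]
periodic = periodic_layers(num_layers=8, period=4)       # periodic 1/4-FA -> [3, 7]
print(guided, periodic)
```

In this toy setting the guided choice keeps layers 2 and 5, whose ablation raises the NLL most, while the periodic 1/4-FA pattern keeps layers 3 and 7 regardless of sensitivity. Obtaining the per-layer NLL values from a real model would require running the calibration set through it once per configuration, which is not shown here.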
