ReasonAlloc: Hierarchical KV Cache Budget Allocation for Long-CoT Reasoning Models
ReasonAlloc is a training-free framework that reframes decoding-time KV cache compression as a hierarchical budget allocation problem with two levels: layer-wise allocation, done offline, and head-wise allocation, done online. The method identifies an architecture-driven pattern called the 'Reasoning Wave' and uses it to preallocate budget across layers, then dynamically reallocates budget to information-rich heads during decoding. Evaluated on MATH-500 and AIME 2024 with DeepSeek-R1-Distill and AceReason models, it outperforms the uniform-budget baselines R-KV, SnapKV, and Pyramid-RKV, with the largest gains at small budgets of 128–512 tokens and negligible overhead.
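To make the two-level idea concrete, here is a minimal Python sketch of a hierarchical budget split. It is an illustration under stated assumptions, not the authors' algorithm: the summary above does not say how layer weights or head scores are computed, how the budget is denominated, or whether a per-head minimum exists. The names `proportional_split`, `allocate_hierarchical`, `layer_weights`, `head_scores`, and `min_per_head` are all hypothetical. `layer_weights` stands in for an offline per-layer profile (such as one derived from the 'Reasoning Wave'), and `head_scores` stands in for whatever online importance signal is measured during decoding.

```python
from fractions import Fraction
from typing import List, Sequence


def proportional_split(total: int, weights: Sequence[float], floor: int = 0) -> List[int]:
    """Split an integer `total` across items in proportion to `weights`.

    Every item gets at least `floor`. What is left is shared by weight using
    the largest-remainder method, so the result always sums exactly to `total`.
    """
    n = len(weights)
    if n == 0:
        raise ValueError("weights must be non-empty")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    if total < floor * n:
        raise ValueError("total is too small to give every item the floor")

    spare = total - floor * n
    exact = [Fraction(w) for w in weights]  # exact arithmetic, no float drift
    weight_sum = sum(exact)
    if weight_sum == 0:
        shares = [Fraction(spare, n)] * n  # no signal: fall back to uniform
    else:
        shares = [spare * w / weight_sum for w in exact]

    alloc = [s.numerator // s.denominator for s in shares]  # integer parts
    leftover = spare - sum(alloc)
    # Give the remaining units to the largest fractional parts (ties: lower index).
    order = sorted(range(n), key=lambda i: (-(shares[i] - alloc[i]), i))
    for i in order[:leftover]:
        alloc[i] += 1
    return [floor + a for a in alloc]


def allocate_hierarchical(
    total_budget: int,
    layer_weights: Sequence[float],
    head_scores: Sequence[Sequence[float]],
    min_per_head: int = 1,
) -> List[List[int]]:
    """Two-stage split: layers first (offline weights), then heads (online scores).

    Assumption: every layer has the same number of heads, and each head keeps
    at least `min_per_head` cache entries.
    """
    if len(layer_weights) != len(head_scores):
        raise ValueError("need one weight and one score list per layer")
    num_heads = len(head_scores[0])
    if any(len(scores) != num_heads for scores in head_scores):
        raise ValueError("all layers must have the same number of heads")

    # Stage 1 (offline in the described method): fix a budget per layer.
    layer_budgets = proportional_split(
        total_budget, layer_weights, floor=min_per_head * num_heads
    )
    # Stage 2 (online in the described method): redistribute within each layer.
    return [
        proportional_split(budget, scores, floor=min_per_head)
        for budget, scores in zip(layer_budgets, head_scores)
    ]


if __name__ == "__main__":
    # Toy example: 2 layers x 2 heads, 16 cache entries in total.
    plan = allocate_hierarchical(16, [3, 1], [[1, 1], [4, 0]])
    print(plan)
```

In this toy run the first layer receives 11 of the 16 entries and the second receives 5; within the second layer, the head with all of the score mass gets 4 entries while the other keeps only the 1-entry minimum. A uniform-budget baseline would instead give each of the four heads 4 entries regardless of weights or scores.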