paper
On Subquadratic Architectures: From Applications to Principles
paper · active · provisional
on-subquadratic-architectures-from-applications-to-principles-1d849fe9 · 1 event · first seen 6d ago
Aliases: On Subquadratic Architectures: From Applications to Principles
More like this (12)
Braun et al. 2025 Compressed Computation
Goedel-Architect
OpenSCAD Architectural 3D LLM Benchmark
Sparse Circuits
Which Models Are Our Models Built On? Auditing Invisible Dependencies in Modern LLMs
Continual LLM Upcycling: A Predictor-Gated Bank-Wise Sparsity Training Recipe for Dense-to-Sparse LLMs
Compressed Computation is (probably) not Computation in Superposition
Operads for compositional reasoning in LLMs
Universal Approximation Theorem
Dense Supervision, Sparse Updates: On the Sparsity and Geometry of On-Policy Distillation
Simons Workshop on Computational Mathematics
From Correctness to Utility: Gain-Based Prefix Evaluation for LLM Reasoning
Recent events (1)
Comparative study finds xLSTM outperforms Mamba-2 and Gated DeltaNet on complex sequence tasks
A new arXiv paper compares three subquadratic sequence modeling architectures — xLSTM, Mamba-2, and Gated DeltaNet — across code model pre-training, LLM distillation, and time-series foundation model pre-training. xLSTM consistently delivers the strongest performance, which the authors attribute to more flexible and stable memory correction via its gating scheme. The paper provides a unified formulation and analysis of state tracking and memory dynamics across the three architectures, with corroborating results on synthetic length-generalization tasks.