
On Subquadratic Architectures: From Applications to Principles

paper · active · provisional · on-subquadratic-architectures-from-applications-to-principles-1d849fe9 · 1 event · first seen 6d ago

Aliases: On Subquadratic Architectures: From Applications to Principles


Recent events (1)

5 · arXiv · cs.LG · 6d ago

Comparative study finds xLSTM outperforms Mamba-2 and Gated DeltaNet on complex sequence tasks

A new arXiv paper compares three subquadratic sequence modeling architectures (xLSTM, Mamba-2, and Gated DeltaNet) in three settings: code model pre-training, LLM distillation, and time-series foundation model pre-training. xLSTM consistently delivers the strongest performance, which the authors attribute to more flexible and stable memory correction via its gating scheme. The paper also provides a unified formulation and analysis of state tracking and memory dynamics across the three architectures, with corroborating results on synthetic length-generalization tasks.