Entity · technique

L0 regularization

technique · active · l0-regularization-4edced4f · 1 event · first seen Jun 17, 2026

Aliases: L0 regularization

Co-occurring entities

More like this (12)

KL-Cov regularization
Entropy Regularization
Cross-sample Consistency Regularization
Target Distribution Regularization
KL-regularized RL
R-Drop consistency regularization
Entropy-Regularized Reinforcement Learning
LoRA (Low-Rank Adaptation)
PALS: Percentile-Aware Layerwise Sparsity for LLM Pruning
Beyond Negative-Ridge Endpoints: Mixed-Sign Spectral Regularization via Negative-Shifted Gradient Descent
Continual LLM Upcycling: A Predictor-Gated Bank-Wise Sparsity Training Recipe for Dense-to-Sparse LLMs
Beyond the Hard Budget: Sparsity Regularizers for More Interpretable Top-k Sparse Autoencoders

Recent events (1)

5 · arXiv · cs.CL · Jun 17, 2026

ConSA: Learned FA/SWA allocation for efficient hybrid attention in LLMs

ConSA is a framework that learns optimal assignments between full attention and sliding-window attention layers under a user-specified sparsity target, using L0 regularization and augmented Lagrangian constraints. Evaluated on 0.6B and 1.7B parameter models, learned allocations consistently outperform hand-crafted rule-based baselines, with KV-head-wise granularity outperforming layer-wise. A consistent structural pattern emerges: SWA concentrates in bottom layers while FA clusters in contiguous middle-layer blocks, diverging from the evenly interleaved patterns used in existing hybrid architectures.
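The summary above pairs L0 regularization with an augmented Lagrangian constraint on a sparsity target. A minimal sketch of that combination follows, using the standard hard-concrete gate relaxation. This is not ConSA's implementation: the function names, the reading of a closed gate as "this head uses SWA", and the linear-plus-quadratic penalty form are all assumptions made for illustration.

```python
import math

# Stretch limits and temperature commonly used for hard-concrete gates
# (assumed defaults, not values taken from the ConSA paper).
GAMMA, ZETA, BETA = -0.1, 1.1, 2.0 / 3.0


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def prob_gate_open(log_alpha):
    """Probability that one hard-concrete gate is non-zero.

    Summing this over gates gives the differentiable surrogate for the L0 norm.
    """
    return sigmoid(log_alpha - BETA * math.log(-GAMMA / ZETA))


def deterministic_gate(log_alpha):
    """Gate value used at inference: stretched sigmoid clipped to [0, 1]."""
    return min(1.0, max(0.0, sigmoid(log_alpha) * (ZETA - GAMMA) + GAMMA))


def expected_sparsity(log_alphas):
    """Expected fraction of closed gates (hypothetically: heads assigned to SWA)."""
    open_fraction = sum(prob_gate_open(a) for a in log_alphas) / len(log_alphas)
    return 1.0 - open_fraction


def lagrangian_penalty(log_alphas, target, lam1, lam2):
    """Penalty lam1 * gap + lam2 * gap**2, where gap = expected sparsity - target.

    In training, the gate parameters would minimize this term alongside the
    task loss while the multipliers lam1 and lam2 are updated by gradient ascent.
    """
    gap = expected_sparsity(log_alphas) - target
    return lam1 * gap + lam2 * gap * gap


if __name__ == "__main__":
    # One hypothetical gate parameter per KV head.
    log_alphas = [2.0, 0.5, -1.0, -3.0]
    print("expected sparsity:", expected_sparsity(log_alphas))
    print("penalty at target 0.5:", lagrangian_penalty(log_alphas, 0.5, 0.1, 1.0))
    print("inference gates:", [deterministic_gate(a) for a in log_alphas])
```

The penalty vanishes only when the expected sparsity equals the target, which is how a user-specified budget can be enforced without hand-tuning a fixed L0 coefficient. Making the multipliers learnable lets the constraint tighten or relax as training proceeds.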

Long Context Evolution · Inference Economics · L0 regularization · ConSA