person
Elia Cunegatti
person · active · provisional
elia-cunegatti-1c603c73 · 1 event · first seen 15d ago
Aliases: Elia Cunegatti
Recent events (1)
SubFit: Submodule-Level Fitted Residual Replacement for LLM Compression
SubFit is a post-training LLM compression method that operates at the submodule level, treating Attention and FeedForward blocks separately rather than removing full layers, and that selects the components to remove non-contiguously. Each removed submodule is replaced with a lightweight fitted residual bypass calibrated on a small amount of data. Evaluated on ten LLMs at sparsity levels from 12.5% to 37.5%, SubFit retains 84.6% of dense downstream accuracy at 25% sparsity, versus 81.6% for the strongest baseline, and reduces perplexity degradation from 4.34x to 2.42x. It also delivers measurable inference speedup and KV-cache savings.
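The idea of a fitted residual bypass can be illustrated with a toy sketch. The summary above does not say what form SubFit's bypass takes, so everything here is an assumption for illustration: the bypass is taken to be a single affine map fitted by ridge least squares, the "submodule" is a small tanh MLP standing in for a FeedForward block, and the names `fit_linear_bypass` and `toy_submodule` are hypothetical. It is not the paper's implementation.

```python
import numpy as np

def fit_linear_bypass(X, F, ridge=1e-3):
    """Fit W, b so that X @ W + b approximates F (ridge least squares).

    Assumption: an affine bypass. The source does not specify the bypass form.
    """
    n, d = X.shape
    Xa = np.hstack([X, np.ones((n, 1))])        # append a bias column
    A = Xa.T @ Xa + ridge * np.eye(d + 1)       # regularised normal equations
    Wb = np.linalg.solve(A, Xa.T @ F)
    return Wb[:-1], Wb[-1]                      # weights (d, d_out), bias (d_out,)

def toy_submodule(X, W1, W2):
    """Hypothetical stand-in for a FeedForward submodule: a small tanh MLP."""
    return np.tanh(X @ W1) @ W2

rng = np.random.default_rng(0)
d, hidden, n_calib = 16, 64, 512
W1 = rng.normal(scale=0.1, size=(d, hidden))
W2 = rng.normal(scale=0.1, size=(hidden, d))

# Small calibration set: submodule inputs and the outputs it adds to the residual stream.
X_calib = rng.normal(size=(n_calib, d))
F_calib = toy_submodule(X_calib, W1, W2)
W, b = fit_linear_bypass(X_calib, F_calib)

# Compare on held-out inputs: dense block, bypassed block, and plain removal.
X_test = rng.normal(size=(128, d))
y_dense = X_test + toy_submodule(X_test, W1, W2)
y_bypass = X_test + (X_test @ W + b)            # submodule replaced by the fitted bypass
y_dropped = X_test                              # submodule removed with no replacement

err_bypass = np.mean((y_dense - y_bypass) ** 2)
err_dropped = np.mean((y_dense - y_dropped) ** 2)
print(f"bypass MSE: {err_bypass:.6f}, dropped MSE: {err_dropped:.6f}")
```

In this toy setting the fitted bypass tracks the dense output more closely than simply deleting the submodule, which is the intuition behind replacing removed components rather than only pruning them. How closely this mirrors SubFit's actual fitting procedure is not stated in the summary.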