Neuron Populations Exhibit Divergent Selectivity with Scale

paper · active · provisional · neuron-populations-exhibit-divergent-selectivity-with-scale-b05d83b0 · 1 event · first seen 14d ago

Recent events (1)

arXiv · cs.LG · 14d ago

Rosetta Neurons follow sublinear power-law scaling with model size, becoming more monosemantic at scale

A new arXiv paper investigates how neuron populations evolve with scale in language models (up to 30B parameters) and vision models (up to 5B parameters). It focuses on 'Rosetta Neurons': neurons with similar activation patterns across independently trained models. The authors find that Rosetta Neurons grow in absolute count but shrink as a fraction of total neurons. They also report a 'Neuron Polarization Effect', in which Rosetta Neurons become increasingly monosemantic with scale while non-Rosetta neurons remain less selective. An analytical model accounts for the sublinear power-law scaling, and a targeted data-filtering case study for continued pretraining demonstrates practical utility. The results extend scaling laws to neuron-level interpretability structure, linking model size to systematic changes in universality and specialization.
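
The two headline observations (absolute count rising while the fraction falls) are what a sublinear power law implies: if the Rosetta count follows R(N) = c * N**alpha with 0 < alpha < 1, then R grows with N while R/N = c * N**(alpha - 1) shrinks. The sketch below illustrates this with synthetic numbers only. The function names, the constant c, the exponent 0.7, and the neuron counts are all assumptions chosen for illustration; none of them come from the paper, and this is not the authors' analytical model.

```python
import math

def rosetta_count(total_neurons, c=1.0, alpha=0.7):
    """Hypothetical power law R(N) = c * N**alpha.

    c and alpha are illustrative placeholders, not values from the paper.
    """
    return c * total_neurons ** alpha

def fit_exponent(sizes, counts):
    """Estimate the power-law exponent as the least-squares slope
    of log(count) against log(size)."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(k) for k in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

# Synthetic total neuron counts (assumed, not measured).
sizes = [1e4, 1e5, 1e6, 1e7]
counts = [rosetta_count(n) for n in sizes]
fractions = [k / n for k, n in zip(counts, sizes)]
alpha_hat = fit_exponent(sizes, counts)

for n, k, f in zip(sizes, counts, fractions):
    print(f"N={n:.0e}  rosetta={k:.1f}  fraction={f:.4f}")
print(f"fitted exponent: {alpha_hat:.3f}")
```

With real measurements, a fitted exponent below 1 on the log-log plot would be the signature of sublinear scaling; an exponent at or above 1 would mean the Rosetta fraction is not shrinking.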