technique
PCA (Principal Component Analysis)
technique · active
pca-principal-component-analysis--9d7ee3c6 · 1 event · first seen 1mo ago
Aliases: PCA (Principal Component Analysis)
Recent events (1)
What exactly does word2vec learn? A closed-form theory of representation learning dynamics
Researchers from BAIR present a new theoretical paper proving that word2vec's learning dynamics reduce, under mild approximations, to unweighted least-squares matrix factorization, with final representations given by PCA on a specific co-occurrence-derived matrix. The theory solves gradient flow dynamics in closed form, showing that embeddings learn one orthogonal linear subspace (concept) at a time in discrete, rank-incrementing steps. This provides a quantitative, predictive account of the linear representation hypothesis observed in word2vec and, by extension, offers a minimal theoretical foundation for understanding feature learning in modern LLMs.
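The link between least-squares matrix factorization and PCA described above can be illustrated with a small toy sketch. It is not the paper's derivation or code: the matrix `M` is a hypothetical synthetic stand-in for the co-occurrence-derived matrix, the symmetric factorization `W @ W.T`, the learning rate, the step count, and the 0.5 singular-value threshold are all assumptions chosen for illustration, and plain gradient descent is used as a discrete approximation of gradient flow. Under those assumptions, the embeddings start near zero, their effective rank grows over training, and the learned product ends close to the rank-k PCA truncation of `M`.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 20, 3  # toy "vocabulary" size and embedding dimension (assumed)

# Hypothetical stand-in for the co-occurrence-derived target matrix:
# symmetric, positive semidefinite, with well-separated eigenvalues.
Q, _ = np.linalg.qr(rng.standard_normal((d, d)))
eigvals = np.zeros(d)
eigvals[:5] = [5.0, 3.0, 1.5, 0.3, 0.1]
M = (Q * eigvals) @ Q.T  # equals Q @ diag(eigvals) @ Q.T

# PCA answer: keep the top-k eigenpairs of M.
evals, evecs = np.linalg.eigh(M)
top = np.argsort(evals)[::-1][:k]
M_pca = (evecs[:, top] * evals[top]) @ evecs[:, top].T

# Gradient descent on the unweighted least-squares loss
#   L(W) = 1/4 * ||M - W W^T||_F^2,  with gradient  -(M - W W^T) W,
# starting from a small random initialization.
W = 1e-3 * rng.standard_normal((d, k))
lr, steps = 0.01, 3000
rank_trace = []  # effective rank of W, sampled every 50 steps
for step in range(steps):
    if step % 50 == 0:
        s = np.linalg.svd(W, compute_uv=False)
        rank_trace.append(int((s > 0.5).sum()))
    W += lr * (M - W @ W.T) @ W

gap = np.linalg.norm(W @ W.T - M_pca)  # Frobenius distance to the PCA solution
print("effective rank over training:", rank_trace)
print("distance to rank-k PCA reconstruction:", gap)
```

Printing `rank_trace` shows how many directions have been learned at each checkpoint; in this toy setting the directions with larger eigenvalues are picked up earlier, which is the qualitative behavior the summary describes as rank-incrementing steps.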