paper
Your UnEmbedding Matrix is Secretly a Feature Lens for Text Embeddings
paper · active · provisional
your-unembedding-matrix-is-secretly-a-feature-lens-for-text-embeddings-1ccbf43d · 1 event · first seen 9d ago
Aliases: Your UnEmbedding Matrix is Secretly a Feature Lens for Text Embeddings
Co-occurring entities
More like this (12)
Massive Text Embedding Benchmark
Sparse Embedding Models
Instructional Segment Embeddings
embedding models
sentence embeddings
Feature Auto-Encoder
OpenAI Embeddings API
Unstable Features, Reproducible Subspaces: Understanding Seed Dependence in Sparse Autoencoders
LCO-Embedding
Joint Embedding Predictive Architecture (JEPA)
static embeddings
Sparse Autoencoders
Recent events (1)
EmbedFilter: Using the unembedding matrix to suppress high-frequency token noise in LLM text embeddings
Researchers find that LLM text embeddings, when projected onto vocabulary space, over-express high-frequency but semantically uninformative tokens, which degrades embedding quality. They introduce EmbedFilter, a simple linear transformation that filters out the subspace of the unembedding matrix responsible for writing these tokens into embedding space. The method improves zero-shot performance on text embedding benchmarks across multiple LLM backbones and, as a byproduct, reduces embedding dimensionality without quality loss. Code is publicly released.
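The summary above does not say how the filtered subspace is chosen, so the following is only a minimal NumPy sketch of one plausible reading, not the released EmbedFilter code. It assumes the "noisy" subspace is spanned by the unembedding rows of a hand-picked set of high-frequency tokens, and it removes that subspace by projecting embeddings onto its orthogonal complement. The names `build_filter`, `apply_filter`, `noisy_token_ids`, and `rank` are hypothetical and exist only for illustration.

```python
import numpy as np


def build_filter(unembed, noisy_token_ids, rank):
    """Build a linear filter that drops `rank` directions from embedding space.

    unembed:         (vocab_size, d) unembedding matrix.
    noisy_token_ids: indices of tokens assumed to be high-frequency noise
                     (an assumption of this sketch, not taken from the paper).
    rank:            number of directions to remove.

    Returns a (d, d - rank) matrix with orthonormal columns.
    """
    rows = unembed[noisy_token_ids]  # (k, d) rows that write the noisy tokens
    # full_matrices=True makes vt a complete orthonormal basis of R^d,
    # ordered from the strongest direction of `rows` to the weakest.
    _, _, vt = np.linalg.svd(rows, full_matrices=True)
    # Keep everything except the top-`rank` directions.
    return vt[rank:].T


def apply_filter(embeddings, keep):
    """Project (n, d) embeddings into the kept (d - rank)-dimensional subspace."""
    return embeddings @ keep


# Toy demonstration with random data (no real model weights involved).
rng = np.random.default_rng(0)
unembed = rng.normal(size=(50, 8))    # pretend vocabulary of 50 tokens, d = 8
noisy_token_ids = [0, 1, 2]           # pretend these are high-frequency tokens
keep = build_filter(unembed, noisy_token_ids, rank=3)

embeddings = rng.normal(size=(4, 8))  # four pretend text embeddings
filtered = apply_filter(embeddings, keep)
print(filtered.shape)
```

Because the filter is a single matrix with `d - rank` columns, applying it both suppresses the selected directions and shortens each embedding, which is consistent with the dimensionality-reduction byproduct mentioned in the summary. How the actual method selects tokens and the number of removed directions is not stated here.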