paper
Influcoder: Distilling Decoders' Gradient Influence Rankings into an Encoder for Data Attribution
paper · active · provisional
influcoder-distilling-decoders-gradient-influence-rankings-into-an-encoder-for-data-attribution-bee6034c · 1 event · first seen 5d ago
Aliases: Influcoder: Distilling Decoders' Gradient Influence Rankings into an Encoder for Data Attribution
Co-occurring entities
More like this (12)
Graph Neural Network Encoder
Integrated Gradients
Gradient-Guided Reward Optimization
Influcoder
Feature Auto-Encoder
Unstable Features, Reproducible Subspaces: Understanding Seed Dependence in Sparse Autoencoders
Sparse Autoencoders
Beyond Fully Random Masking: Attention-Guided Denoising and Optimization for Diffusion Language Models
gradient accumulation
Anatomy of Post-Training: Using Interpretability to Characterize Data and Shape the Learning Signal
VAE Encoder
embedding model leaderboard
Recent events (1)
Influcoder: Distilling gradient influence rankings into an encoder for scalable data attribution
Influcoder is a proposed method for scalable data attribution in LLM training. It distills decoder-based gradient influence rankings into a compact encoder representation. The approach targets the practical bottleneck of influence-function methods, namely their high computational cost and storage requirements, with the aim of making them viable for large-scale dataset curation. The work is relevant to training-data quality filtering and to identifying the sources of undesirable model behavior such as toxicity.
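
To make the idea of ranking distillation concrete, the following is a minimal sketch and not the paper's implementation. It assumes that the teacher signal is a first-order gradient dot product between a query example and each training example, and that the student is trained with a listwise (ListNet-style) KL loss; both choices, along with every function name below, are illustrative assumptions not stated in the summary above.

```python
import numpy as np

def log_softmax(x):
    """Numerically stable log-softmax over a 1-D score vector."""
    z = x - np.max(x)
    return z - np.log(np.sum(np.exp(z)))

def gradient_influence(query_grad, train_grads):
    """Hypothetical teacher: one gradient dot product per training example.

    query_grad:  (d,)   flattened decoder gradient for the query example
    train_grads: (n, d) flattened decoder gradients for n training examples
    """
    return train_grads @ query_grad

def encoder_scores(query_emb, train_embs):
    """Hypothetical student: similarity of compact encoder embeddings."""
    return train_embs @ query_emb

def listwise_distill_loss(teacher_scores, student_scores, temperature=1.0):
    """KL(teacher || student) between softmax distributions over the
    training examples. Assumed loss, chosen only for illustration."""
    log_p = log_softmax(teacher_scores / temperature)
    log_q = log_softmax(student_scores / temperature)
    return float(np.sum(np.exp(log_p) * (log_p - log_q)))

# Toy data: 5 training examples with 8-dimensional "gradients".
rng = np.random.default_rng(0)
train_grads = rng.normal(size=(5, 8))
query_grad = rng.normal(size=8)

teacher = gradient_influence(query_grad, train_grads)

# A student that reproduces the teacher scores exactly incurs no loss;
# a student that reverses the score order is penalized.
perfect_loss = listwise_distill_loss(teacher, teacher)
reversed_loss = listwise_distill_loss(teacher, teacher[::-1].copy())
```

Under these assumptions, the storage saving would come from keeping one low-dimensional embedding per training example in place of a full gradient, so that attribution at query time reduces to the similarity lookup in `encoder_scores`.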