Influcoder: Distilling Decoders' Gradient Influence Rankings into an Encoder for Data Attribution

paper · active · provisional · influcoder-distilling-decoders-gradient-influence-rankings-into-an-encoder-for-data-attribution-bee6034c · 1 event · first seen 5d ago

Recent events (1)

4 · arXiv · cs.CL · 5d ago

Influcoder: Distilling gradient influence rankings into an encoder for scalable data attribution

Influcoder is a proposed method for scalable data attribution in LLM training. It distills gradient influence rankings computed with a decoder into a compact encoder representation. The approach targets the practical bottleneck of influence-function methods, namely their high computational cost and storage requirements, with the aim of making them viable for large-scale dataset curation. The work is relevant to training-data quality filtering and to identifying the sources of undesirable model behavior such as toxicity.
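
To make the idea of distilling a ranking concrete, here is a minimal sketch of one common way to do it: a pairwise logistic ranking loss that penalizes a student whenever its similarity scores order two training examples differently from the teacher's influence scores. This summary does not state which loss Influcoder actually uses, so the loss choice, the dot-product scoring, the function names (`dot`, `pairwise_ranking_loss`), and all numbers below are illustrative assumptions, not the paper's method.

```python
import math

def dot(u, v):
    """Dot product of two equal-length vectors (assumed student scoring function)."""
    return sum(a * b for a, b in zip(u, v))

def pairwise_ranking_loss(teacher_scores, student_scores):
    """Mean logistic loss over all pairs (i, j) that the teacher ranks i above j.

    The loss is small when the student also scores i above j, and large
    when the student reverses the teacher's order.
    """
    total, count = 0.0, 0
    n = len(teacher_scores)
    for i in range(n):
        for j in range(n):
            if teacher_scores[i] > teacher_scores[j]:
                margin = student_scores[i] - student_scores[j]
                # Numerically stable form of log(1 + exp(-margin)).
                total += max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
                count += 1
    return total / count if count else 0.0

# Hypothetical teacher influence scores for three training examples
# with respect to one query (made-up values for illustration only).
teacher = [0.9, 0.1, -0.4]

# Hypothetical encoder embeddings: one query and three training examples.
query = [1.0, 0.0]
train_embs = [[2.0, 0.3], [0.5, -1.0], [-1.0, 0.7]]

# Student scores are embedding similarities; here they agree with the teacher's order.
student_agree = [dot(query, e) for e in train_embs]
# A student that reverses the teacher's order, for comparison.
student_reversed = list(reversed(student_agree))

loss_agree = pairwise_ranking_loss(teacher, student_agree)
loss_reversed = pairwise_ranking_loss(teacher, student_reversed)
print(loss_agree < loss_reversed)
```

In a setup like this, the expensive gradient-based scores would be needed only to produce training targets, while attribution at scale would reduce to embedding lookups and similarity computations. Whether Influcoder is structured this way is not confirmed by the summary above.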