Influcoder

product · active · provisional · influcoder-69237961 · 1 event · first seen 5d ago

Aliases: Influcoder

Recent events (1)

4 · arXiv · cs.CL · 5d ago

Influcoder: Distilling gradient influence rankings into an encoder for scalable data attribution

Influcoder is a proposed method for scalable data attribution in LLM training. It distills decoder-based gradient influence rankings into a compact encoder representation. The approach targets the practical bottleneck of influence function methods, namely their high computational cost and storage requirements, with the aim of making them viable for large-scale dataset curation. The work is relevant to training data quality filtering and to identifying sources of undesirable model behavior such as toxicity.
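
As a rough illustration of what distilling a ranking can mean, the sketch below scores a student encoder against a teacher's influence ranking using a pairwise logistic (RankNet-style) loss. This is an assumption made for illustration only: the summary above does not say which loss, encoder, or similarity function Influcoder uses, and every name and number here (`pairwise_rank_loss`, the teacher scores, the embeddings) is hypothetical.

```python
import math


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def pairwise_rank_loss(teacher_scores, student_scores):
    """RankNet-style loss (assumed, not taken from the paper).

    For every pair (i, j) where the teacher ranks i above j, penalize
    the student when it does not also score i above j.
    """
    total, pairs = 0.0, 0
    n = len(teacher_scores)
    for i in range(n):
        for j in range(n):
            if teacher_scores[i] > teacher_scores[j]:
                margin = student_scores[i] - student_scores[j]
                total += math.log1p(math.exp(-margin))
                pairs += 1
    return total / pairs if pairs else 0.0


# Hypothetical teacher: gradient-based influence of three training
# examples on one query example (expensive to compute at scale).
teacher = [0.9, 0.1, -0.4]

# Hypothetical student: encoder embeddings, scored by dot product
# with the query embedding (cheap to compute and store).
query_emb = [1.0, 0.0]
embs_agree = [[2.0, 0.5], [0.5, 1.0], [-1.0, 0.3]]     # same order as teacher
embs_disagree = [[-1.0, 0.3], [0.5, 1.0], [2.0, 0.5]]  # reversed order

loss_agree = pairwise_rank_loss(teacher, [dot(e, query_emb) for e in embs_agree])
loss_disagree = pairwise_rank_loss(teacher, [dot(e, query_emb) for e in embs_disagree])

print(f"agreeing encoder loss:    {loss_agree:.4f}")
print(f"disagreeing encoder loss: {loss_disagree:.4f}")
```

Under this assumed setup, an encoder whose similarity scores reproduce the teacher's ordering gets a lower loss than one that reverses it. Minimizing such a loss over many queries would be one way to train the encoder so that attribution at curation time needs only embedding lookups rather than per-example gradients.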