Faith-Shap
technique · active
faith-shap-8247b312 · 1 event · first seen 1mo ago
Aliases: Faith-Shap
Recent events (1)
SPEX and ProxySPEX: Scalable Interaction Discovery for LLM Interpretability
Researchers from BAIR introduce SPEX (Spectral Explainer) and ProxySPEX, algorithms for identifying influential interactions among features, training data, and model components in LLMs at scale. The approach exploits three structural properties of these interactions (sparsity, low degree, and hierarchy) to reframe interaction discovery as a sparse recovery problem, using tools from signal processing and coding theory. ProxySPEX achieves performance comparable to SPEX with roughly 10x fewer ablations by leveraging hierarchical structure. The methods are evaluated on feature attribution (sentiment analysis), data attribution, and mechanistic interpretability tasks, and outperform marginal methods like LIME at long context lengths.
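
To make the "sparse, low-degree" framing concrete, here is a minimal sketch of what a sparse interaction spectrum looks like for a toy function. It is not the SPEX or ProxySPEX algorithm: it enumerates all 2^n ablation masks, which is exactly the cost those methods are designed to avoid. The parity (Walsh-Hadamard style) basis is an assumption chosen for illustration, and `toy_value`, `parity`, and `fourier_coefficients` are hypothetical names, not part of any released code.

```python
from itertools import combinations, product


def toy_value(mask):
    """Hypothetical model output for a binary mask (1 = feature kept)."""
    x0, x1, x2, x3 = mask
    # One main effect plus one pairwise interaction; x3 has no effect.
    return 2.0 * x0 + 3.0 * x1 * x2


def parity(mask, subset):
    """Parity basis function: (-1) ** (number of kept features in subset)."""
    return -1 if sum(mask[i] for i in subset) % 2 else 1


def fourier_coefficients(f, n, max_degree):
    """Exact coefficients up to max_degree, by brute-force enumeration."""
    masks = list(product((0, 1), repeat=n))
    coeffs = {}
    for degree in range(max_degree + 1):
        for subset in combinations(range(n), degree):
            total = sum(f(m) * parity(m, subset) for m in masks)
            coeffs[subset] = total / len(masks)
    return coeffs


coeffs = fourier_coefficients(toy_value, n=4, max_degree=2)
# Keep only the nonzero terms: this is the sparse support to be recovered.
support = {s: c for s, c in coeffs.items() if abs(c) > 1e-9}
print(support)
```

In this toy case only five of the eleven candidate terms of degree two or lower are nonzero, and the pair `(1, 2)` appears alongside its lower-order subsets `(1,)` and `(2,)`. That is a small illustration of the sparsity, low-degree, and hierarchy properties the summary mentions; a sparse recovery method would aim to find the same support from far fewer masks.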