4 · arXiv cs.AI (Artificial Intelligence) · 5d ago

LEAF-X: Entropy-guided explainability framework for transformer-based ASR models

Researchers introduce LEAF-X (Listening with Entropy-guided Attention for Faithful explainability), a model-intrinsic XAI framework for transformer-based automatic speech recognition systems such as Whisper. The method combines entropy-guided attention weighting, multi-layer attention rollout, and optional causal ablations to produce sparse token-to-frame attributions. In the reported evaluations, LEAF-X improves faithfulness by 32% and locality/sparsity by 35-39% relative to perturbation-based explainers and raw attention maps, making ASR output easier to audit.
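The combination of entropy-guided weighting and attention rollout named in the summary can be illustrated with a minimal sketch. This is not the paper's implementation: the head-weighting rule (weights proportional to `exp(-entropy)`), the 50/50 residual mix, and the use of square self-attention matrices are all assumptions made for illustration, and the function names are hypothetical. LEAF-X itself targets token-to-frame attributions, which this toy version does not reproduce.

```python
import numpy as np

def head_entropy(attn, eps=1e-12):
    """Mean Shannon entropy of the attention rows, per head.

    attn has shape (heads, T, T) and every row sums to 1.
    """
    row_entropy = -(attn * np.log(attn + eps)).sum(axis=-1)  # (heads, T)
    return row_entropy.mean(axis=-1)                         # (heads,)

def entropy_weighted_layer(attn):
    """Fuse heads into one (T, T) map, favouring low-entropy (focused) heads.

    Assumption: weights are proportional to exp(-entropy); the source does
    not state the actual weighting rule.
    """
    weights = np.exp(-head_entropy(attn))
    weights = weights / weights.sum()
    return np.tensordot(weights, attn, axes=1)

def rollout(layers):
    """Attention rollout: multiply residual-augmented layer maps in order."""
    size = layers[0].shape[-1]
    joint = np.eye(size)
    for attn in layers:
        fused = entropy_weighted_layer(attn)
        fused = 0.5 * fused + 0.5 * np.eye(size)   # account for the skip connection
        fused = fused / fused.sum(axis=-1, keepdims=True)
        joint = fused @ joint
    return joint

# Toy input: 3 layers, 4 heads, 5 positions of random row-normalised attention.
rng = np.random.default_rng(0)
toy_layers = []
for _ in range(3):
    raw = rng.random((4, 5, 5))
    toy_layers.append(raw / raw.sum(axis=-1, keepdims=True))

attribution = rollout(toy_layers)
```

Each row of `attribution` remains a probability distribution over input positions, so rows can be compared directly across tokens.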

AI Safety Research · Listening with Attention: Entropy-Guided Explainability for Transformer-Based Audio Models · LEAF-X · Whisper

Related guides (1)

AI Safety Research · Topic guide

AI Safety Research: From Lab Policies to Real-World Flashpoints

Related events (8)

3 · arXiv · cs.LG · 11d ago

LLM-augmented XAI framework with mutual feature interactions for network operations

A new arXiv paper proposes a framework combining LLMs with SHAP-based explainability, augmented by mutual feature interaction data, to generate natural language explanations for AI/ML models used in network operations. The approach is validated on an optical quality-of-transmission estimation task with human evaluators, showing 12.2% and 6.2% improvements in explanation usefulness and scope over a SHAP-only baseline, with 97.5% correctness. The work targets the gap between technical XAI outputs and actionable insights for non-specialist network operators.

Evaluation and Benchmarking · Generative Explainability for Next-Generation Networks: LLM-Augmented XAI with Mutual Feature Interactions · SHapley Additive exPlanations

4 · Hugging Face Blog · 1mo ago

Fine-Tune Whisper For Multilingual ASR with 🤗 Transformers

This Hugging Face blog post provides a practical guide for fine-tuning OpenAI's Whisper model for multilingual automatic speech recognition using the Transformers library. It covers dataset preparation, training configuration, and evaluation using the Word Error Rate metric. The post targets practitioners seeking to adapt Whisper to low-resource or domain-specific languages.
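For readers unfamiliar with the metric, Word Error Rate is the word-level edit distance (substitutions, deletions, insertions) between a hypothesis and a reference, divided by the number of reference words. The sketch below is a plain-Python illustration of that definition, not code from the blog post; the function name `word_error_rate` is hypothetical, and it assumes whitespace tokenisation with no text normalisation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i] + [0] * len(hyp_words)
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref_words)

# One substitution ("sat" -> "sit") and one deletion ("the") over 6 reference words.
score = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

In practice, transcripts are usually lower-cased and stripped of punctuation before scoring, since the raw metric treats "Mat." and "mat" as different words.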

Open Weights Progress · Agent and Tool Ecosystem · Hugging Face Transformers · Hugging Face · Word Error Rate

5 · arXiv · cs.AI · 11d ago

Explainability pipeline reveals divergent cues used by deepfake speech detectors

Researchers propose an audio-native explainability pipeline that applies Integrated Gradients to time-aligned self-supervised representations in order to localize decision evidence in deepfake speech detectors. Applied to three WavLM-based detectors (AASIST, CA-MHFA, SLS) on the ASVspoof 5 benchmark, the method reveals that, despite similar performance, the detectors rely on fundamentally different cues: environmental noise, phoneme artifacts, and word boundaries, respectively. Causal masking experiments validate the findings by confirming that performance degrades when each detector's primary cues are removed. The work advances the interpretability of audio deepfake detection, which is relevant to AI safety and media authenticity.
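Integrated Gradients attributes a prediction to input features by averaging the model's gradient along a straight path from a baseline to the input and scaling by the input-baseline difference. The sketch below shows that computation on a toy logistic model with an analytic gradient. It is an illustration only: the toy model, the zero baseline, the midpoint-rule approximation, and all names are assumptions, and nothing here reflects the paper's WavLM-based pipeline.

```python
import numpy as np

def integrated_gradients(grad_fn, x, baseline, steps=50):
    """Approximate Integrated Gradients with a midpoint Riemann sum.

    grad_fn maps an input vector to the gradient of the model output
    with respect to that input.
    """
    alphas = (np.arange(steps) + 0.5) / steps            # midpoints in (0, 1)
    path = baseline + alphas[:, None] * (x - baseline)   # (steps, features)
    grads = np.stack([grad_fn(point) for point in path])
    return (x - baseline) * grads.mean(axis=0)

# Toy model (hypothetical): f(x) = sigmoid(w . x), gradient known in closed form.
w = np.array([1.0, -2.0, 0.5])

def model(x):
    return 1.0 / (1.0 + np.exp(-(w @ x)))

def model_grad(x):
    s = model(x)
    return s * (1.0 - s) * w

x = np.array([0.5, 0.1, -0.3])
baseline = np.zeros_like(x)
attributions = integrated_gradients(model_grad, x, baseline)

# Completeness check: attributions should sum to f(x) - f(baseline).
gap = abs(attributions.sum() - (model(x) - model(baseline)))
```

The completeness check at the end is a common sanity test: if `gap` is not close to zero, the number of integration steps is too small for the model's curvature.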

Evaluation and Benchmarking · AI Safety Research · CA-MHFA · Integrated Gradients · SLS

8 · OpenAI Blog · 1mo ago

Introducing Whisper

OpenAI introduced Whisper, an open-source automatic speech recognition (ASR) system trained on 680,000 hours of multilingual and multitask supervised data collected from the web. The model demonstrates strong robustness to accents, background noise, and technical language, approaching human-level accuracy in English transcription. Whisper supports transcription in multiple languages as well as translation to English, and the weights and inference code were released publicly.

Open Weights Progress · Agent and Tool Ecosystem · OpenAI · Whisper

6 · Berkeley AI Research (BAIR) Blog · 1mo ago

SPEX and ProxySPEX: Scalable Interaction Discovery for LLM Interpretability

Researchers from BAIR introduce SPEX (Spectral Explainer) and ProxySPEX, algorithms for identifying influential feature, data, and model-component interactions in LLMs at scale. The approach exploits sparsity, low-degreeness, and hierarchy properties to reframe interaction discovery as a sparse recovery problem using tools from signal processing and coding theory. ProxySPEX achieves comparable performance to SPEX with roughly 10x fewer ablations by leveraging hierarchical structure. The methods are evaluated on feature attribution (sentiment analysis), data attribution, and mechanistic interpretability tasks, outperforming marginal methods like LIME at long context lengths.

Long Context Evolution · Evaluation and Benchmarking · GPT-4o mini · Faith-Shap · LIME

3 · arXiv · cs.CL · 5d ago

Continual learning approach for disfluency-aware ASR with explicit disfluency tokens

A new arXiv preprint addresses the challenge of transcribing disfluent speech (hesitations, repetitions, fillers) in ASR systems, which typically omit such markers and thereby lose information. The authors introduce explicit disfluency tokens into a pretrained ASR model and apply continual learning to adapt across datasets with varying disfluency distributions while mitigating catastrophic forgetting. The work identifies a trade-off between learning disfluency markers and general ASR performance, and finds a consistent cross-attention head mechanism shared across continual learning methods.

Learning to Hear Hesitation: Continual Learning for Disfluency-Aware ASR

4 · arXiv · cs.CL · 9d ago

Zero-shot LLMs fail to beat baselines on stock prediction; explainability signals retain practical value

A new arXiv preprint evaluates zero-shot NLP pipelines for predicting short-term stock movements from financial news, finding that across multiple models and prediction horizons, zero-shot approaches consistently fail to outperform simple baselines, with especially weak performance on negative price movements. The authors introduce a multi-layered explainability framework linking predictions to token-, article-, and aggregate-level evidence, finding that explainability signals can reliably distinguish trustworthy from unreliable predictions even when accuracy is low. The work argues for a shift toward decision-support systems emphasizing transparency and uncertainty awareness rather than raw predictive accuracy.

Evaluation and Benchmarking · Can News Predict the Market? Limits of Zero-Shot Financial NLP and the Role of Explainable AI

5 · arXiv · cs.CL · 1mo ago

Conditional Scale Entropy: A Wavelet-Derived Tool for Mechanistic Interpretability of Metaphor Processing in Transformers

This paper introduces Conditional Scale Entropy (CSE), a wavelet-derived measure of how transformer computation engages across frequency scales at each layer, and applies it to study metaphor processing in decoder-only language models. The authors prove CSE is invariant to update magnitude, isolating structural computation patterns from intensity. Across architectures ranging from GPT-2 (124M) to LLaMA-2 7B and GPT-oss 20B, metaphorical tokens consistently produce higher spectral breadth than literal tokens in early-to-mid layers, with the effect surviving permutation correction and specificity controls. The work establishes multi-scale coordination as a consistent mechanistic signature of metaphorical language processing and positions CSE as a general interpretability tool for cross-depth structure in transformers.

Evaluation and Benchmarking · AI Safety Research · Conditional Scale Entropy · mechanistic interpretability · GPT-2