LMs as Task-Specific Knowledge Bases: An Interpretability Analysis

paper · active · provisional · lms-as-task-specific-knowledge-bases-an-interpretability-analysis-2fc54854 · 1 event · first seen 3d ago

More like this (12)

Can LLMs Judge Better Than They Generate? Evaluating Task Asymmetry, Mechanistic Interpretability and Transferability for In-Context QA
Knowledge-Graph Grounding Helps LLMs Only for Out-of-Training Knowledge: A Controlled Study on Clinical Question Answering
semi-structured knowledge bases
Cross-Lingual Exploration for Parametric Knowledge
Cognitive Episodes in LLM Reasoning Traces Enable Interpretable Human Item Difficulty Prediction
SIMMER: Benchmarking Latent Failures in LLM Executable Planning with a World Model
A Training-Free Mixture-of-Agents Framework for Multi-Document Summarization using LLMs and Knowledge Graphs
interpretability
The Masked Advantage: Uncovering Local-Language Access to Cultural Knowledge in LLMs
Causally Evaluating the Learnability of Formal Language Tasks
Does VLA Even Know the Basics? Measuring Commonsense and World Knowledge Retention in Vision-Language-Action Models
Which Models Are Our Models Built On? Auditing Invisible Dependencies in Modern LLMs

Recent events (1)

arXiv · cs.CL · 3d ago

LMs encode knowledge in task-specific parameter subsets, undermining the knowledge-base analogy

A new arXiv paper investigates whether language models satisfy the consistency property of knowledge bases: the same fact should be returned consistently regardless of the form of the query. Behavioral and mechanistic analyses reveal that LMs encode knowledge in a task-specific manner. Facts acquired on one task during training frequently fail to transfer to other tasks, and distinct parameter subsets underlie the same fact across different tasks. The authors also show that chain-of-thought reasoning derives part of its effectiveness from engaging task-specific parameters beyond those tied to the evaluation task. The findings have implications for factual reliability and model controllability.
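
To make the consistency property concrete, here is a minimal Python sketch of one way agreement across query forms could be scored for a single fact. It is an illustration only: the function name `consistency`, the task-format labels, and the sample answers are hypothetical and are not taken from the paper, whose own measurements may be defined differently.

```python
from collections import Counter


def consistency(answers_by_form):
    """Return the fraction of query forms that agree with the majority answer.

    `answers_by_form` maps a (hypothetical) task-format label to the answer
    a model gave for the same underlying fact in that format.
    """
    # Normalize case and whitespace so trivial surface differences are ignored.
    normalized = [a.strip().lower() for a in answers_by_form.values()]
    if not normalized:
        raise ValueError("need at least one answer")
    majority_count = Counter(normalized).most_common(1)[0][1]
    return majority_count / len(normalized)


# Hypothetical outputs for one fact queried through three task formats.
answers = {
    "cloze": "Paris",
    "question_answering": "paris",
    "multiple_choice": "Lyon",
}
score = consistency(answers)
print(round(score, 2))
```

A score of 1.0 would mean every query form returned the same answer, which is what a conventional knowledge base guarantees; lower scores indicate the kind of task-dependent behavior the summary describes.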

Evaluation and Benchmarking · AI Safety Research