When Certainty Is an Artifact: Keyword Lexicon Blindness and the (Mis)Measurement of Rhetorical Stance
Recent events (1)
LLM-based classification exposes keyword lexicon artifacts in computational social science stance measurement
A new arXiv preprint demonstrates that statistically significant findings in computational social science can be pure measurement artifacts of keyword-based scoring instruments. Drawing on 85 interviews spanning four public intellectuals, the authors show that keyword-based certainty scores produce strong correlations (r = 0.72–0.93) that collapse or invert when the keyword scores are replaced with LLM zero-shot semantic classification applied to 32,625 sentences. The paper identifies three structural failure modes in keyword lexicons (syntactic blindness, polysemy blindness, and categorical absence) and argues that keyword counts measure lexical co-occurrence tendencies rather than rhetorical stance. The work bears on the validity of prior NLP-based social science research and on the comparative utility of LLMs as measurement instruments.
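The failure modes named in the summary can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy lexicon `CERTAINTY_TERMS`, the function `keyword_certainty_score`, and the example sentences are hypothetical and are not taken from the preprint. The last example reflects only one plausible reading of "categorical absence" (a confident statement that contains no lexicon term).

```python
import re

# Hypothetical toy lexicon for illustration; NOT the lexicon used in the preprint.
CERTAINTY_TERMS = {"certain", "certainly", "clearly", "definitely", "sure"}


def keyword_certainty_score(sentence: str) -> int:
    """Count lexicon hits in a sentence, ignoring syntax and word sense."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    return sum(1 for tok in tokens if tok in CERTAINTY_TERMS)


# Hypothetical sentences, one per case.
examples = {
    # A genuinely confident assertion that happens to contain a lexicon term.
    "asserted": "This policy will definitely fail.",
    # Syntactic blindness: negation flips the stance, but the count is unchanged.
    "negated": "I am not certain this policy will fail.",
    # Polysemy blindness: "certain" here means "a particular", not "confident".
    "polysemous": "A certain economist raised that objection.",
    # One reading of categorical absence: confident stance, no lexicon term.
    "no_keyword": "This policy will fail, full stop.",
}

scores = {label: keyword_certainty_score(text) for label, text in examples.items()}

for label, score in scores.items():
    print(f"{label:>10}: {score}")
```

In this sketch the asserted, negated, and polysemous sentences all receive the same score of 1, while the confident sentence without a lexicon term scores 0. That is the sense in which a keyword count can track word occurrence rather than stance.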