
Normal Tech
normal-tech-8dc83256 · 5 events · first seen 1mo ago
Aliases: Normal Tech
Recent events (5)
New paper: AI agents that matter
A paper from the AI Snake Oil / Normal Tech group critiques current AI agent benchmarking and evaluation practices. It argues that existing agent benchmarks are poorly designed for assessing real-world utility and calls for rethinking how agent performance is measured. The critique targets the gap between benchmark scores and practical deployment value.
Open-world evaluations for measuring frontier AI capabilities: Introducing CRUX
This commentary introduces CRUX, a new evaluation project designed to assess frontier AI systems on long-horizon, open-ended, and messy real-world tasks. The piece argues that existing benchmarks are insufficient for capturing the full range of capabilities exhibited by frontier models in complex settings. CRUX aims to fill this gap by providing evaluations that better reflect deployment-relevant performance.
We Looked at 78 Election Deepfakes. Political Misinformation is not an AI Problem.
An analysis of 78 election-related deepfakes argues that political misinformation is not fundamentally an AI problem, challenging the prevailing narrative that AI-generated content is the primary driver of electoral disinformation. The piece contends that technology is neither the root cause of political misinformation nor its solution. Published on the AI Snake Oil / Normal Tech platform, it is a data-informed commentary that pushes back on AI-centric framings of election integrity concerns.
AI existential risk probabilities are too unreliable to inform policy
This commentary argues that numerical probability estimates for AI existential risk are epistemically unreliable and should not be used as a basis for policy decisions. The piece critiques the practice of assigning precise figures to speculative scenarios, characterizing it as pseudo-quantification that lends false credibility to uncertain claims. The author contends that such estimates are laundered speculation rather than grounded forecasting.
AGI is not a milestone
This commentary argues that AGI should not be understood as a discrete capability threshold that triggers sudden societal or economic impacts. The piece challenges the milestone framing common in AI discourse, suggesting that AI impacts are and will continue to be gradual and diffuse rather than punctuated. It positions itself against narratives from major labs that treat AGI as a definable, imminent event.