benchmark
SearchGEO
benchmark · active · provisional
searchgeo-0a027254 · 1 event · first seen 33h ago
Aliases: SearchGEO
Recent events (1)
SearchGEO framework measures LLM search agent vulnerability to web content manipulation
Researchers introduce SearchGEO, a controlled evaluation framework for measuring endorsement corruption in LLM-based web-search agents, combining a manipulation pipeline, five-mode attack taxonomy, and multiple output metrics. Evaluating 13 LLM backends on 308 cases each, they find attack success rates ranging from 0.0% on Claude-Sonnet-4.6 to 31.4% on Gemini-3-Flash, with model-family-specific vulnerability patterns. An auxiliary probe escalating endorsement to install commands reveals a behavioral split: Claude over-rejects while GPT over-trusts. The findings argue for treating adversarial search content robustness as a first-class safety evaluation dimension for deployed agents.
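As an illustration of how a per-backend attack success rate like the ones quoted above might be tallied, here is a minimal sketch. It assumes the rate is the percentage of evaluated cases in which the agent's endorsement was corrupted; the summary does not give the exact formula. The `CaseResult` record, its field names, and the toy data are hypothetical and are not taken from SearchGEO.

```python
from collections import defaultdict
from typing import Dict, Iterable, NamedTuple


class CaseResult(NamedTuple):
    """One evaluated case (hypothetical schema, not SearchGEO's own)."""
    backend: str       # name of the LLM backend under test
    attack_mode: str   # one of the attack modes in the taxonomy
    corrupted: bool    # True if the agent endorsed the manipulated content


def attack_success_rate(results: Iterable[CaseResult]) -> Dict[str, float]:
    """Return, per backend, the percentage of cases flagged as corrupted."""
    totals: Dict[str, int] = defaultdict(int)
    hits: Dict[str, int] = defaultdict(int)
    for r in results:
        totals[r.backend] += 1
        hits[r.backend] += int(r.corrupted)
    return {b: 100.0 * hits[b] / totals[b] for b in totals}


# Toy data for illustration only; these are not results from the study.
toy = [
    CaseResult("model-a", "mode-1", False),
    CaseResult("model-a", "mode-2", False),
    CaseResult("model-a", "mode-3", False),
    CaseResult("model-a", "mode-4", False),
    CaseResult("model-b", "mode-1", True),
    CaseResult("model-b", "mode-2", False),
    CaseResult("model-b", "mode-3", False),
    CaseResult("model-b", "mode-4", False),
]

rates = attack_success_rate(toy)
for backend, rate in sorted(rates.items()):
    print(f"{backend}: {rate:.1f}%")
```

Grouping on `attack_mode` instead of `backend` would give a per-mode breakdown in the same way, which is one way the model-family-specific patterns mentioned above could be inspected.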