Will the Agent Recuse Itself? Measuring LLM-Agent Compliance with In-Band Access-Deny Signals
will-the-agent-recuse-itself-measuring-llm-agent-compliance-with-in-band-access-deny-signals-7299f6cf·1 events·first seen 12d agoAliases: Will the Agent Recuse Itself? Measuring LLM-Agent Compliance with In-Band Access-Deny Signals
Co-occurring entities
More like this (12)
Recent events (1)
Recuse Signal: In-band access-deny standard for LLM agents shows 100% compliance in pilot
Researchers propose and empirically test a lightweight 'Recuse Signal' — a cooperative, in-band deny mechanism analogous to robots.txt — that servers can emit over existing protocol channels (SSH banners, PostgreSQL NOTICEs) to ask autonomous LLM agents to voluntarily withdraw. A controlled pilot using GPT-4o, GPT-4o-mini, and Claude Code found 100% recusal when the signal was present versus 100% task completion in controls, though the signal behaved cooperatively rather than absolutely: explicit operator-authorization framing caused the most capable model to override the signal. The work defines an open mini-standard, releases two low-footprint adapters, and frames the mechanism as a governance control rather than a security boundary.