Almanac
technique

Distributed Agent Attack

techniqueactiveprovisionaldistributed-agent-attack-705ccdd6·1 events·first seen 16d ago

Aliases: Distributed Agent Attack

Co-occurring entities

More like this (12)

Recent events (1)

7arXiv · cs.AI·16d ago·source ↗

Stateful Online Monitoring Catches Distributed Agent Attacks via Cross-Account Clustering

Researchers demonstrate the first known distributed agent attack, a multi-agent scaffold that splits harmful cybersecurity tasks across many user accounts to evade per-transcript safety monitors, reducing detection rates to roughly one-fifth of standard attacks. As a defense, they develop a stateful online monitor that clusters weak suspiciousness signals across many agent transcripts in real time, escalating only rarely to a full LM-based review. In large-scale simulated datacenter traffic evaluations, the monitor Pareto-dominates standard monitors by catching distributed attacks 30% earlier with negligible latency overhead for ~99% of traffic. The system also incidentally catches standard jailbreaks, since adaptive attackers tend to reuse attack variants across accounts.