AutoSkillHarm

product · active · provisional · autoskillharm-30ae16bb · 1 event · first seen 15d ago

Aliases: AutoSkillHarm

Recent events (1)

7 · arXiv · cs.CL · 15d ago

SkillHarm: Lifecycle-Aware Benchmark for Skill-Based Attacks on AI Agents

SkillHarm is a new benchmark evaluating adversarial attacks on AI agent skills across their full use lifecycle, covering two attack scenarios: Fixed-Payload Poisoning (FPP) and Self-Mutating Poisoning (SMP). The benchmark includes 879 attack samples across 71 skills, organized under a 12-category risk taxonomy targeting data pipelines, system environments, and agent autonomy. Experiments show current agents remain highly vulnerable, with attack success rates up to 86.3% (FPP) and 69.3% (SMP). An automated construction pipeline called AutoSkillHarm, driven by coding agents, was used to generate the benchmark at scale.
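The summary reports attack success rates without saying how they are computed. As a minimal sketch, assuming the rate is the share of attack trials in a scenario where the attack succeeded, the per-scenario figure could be tallied as below. The function name `attack_success_rate`, the `(scenario, succeeded)` record shape, and the toy records are all hypothetical illustrations, not part of SkillHarm or AutoSkillHarm.

```python
from collections import defaultdict


def attack_success_rate(trials):
    """Return {scenario: percentage of trials in which the attack succeeded}.

    `trials` is an iterable of (scenario, succeeded) pairs. This record
    shape is an assumption made for illustration only.
    """
    totals = defaultdict(int)
    successes = defaultdict(int)
    for scenario, succeeded in trials:
        totals[scenario] += 1
        if succeeded:
            successes[scenario] += 1
    return {s: 100.0 * successes[s] / totals[s] for s in totals}


# Hypothetical toy records, NOT data from the benchmark.
toy_trials = [
    ("FPP", True), ("FPP", True), ("FPP", False), ("FPP", True),
    ("SMP", True), ("SMP", False),
]

rates = attack_success_rate(toy_trials)
for scenario, rate in sorted(rates.items()):
    print(f"{scenario}: {rate:.1f}%")
```

Grouping by scenario keeps the FPP and SMP figures separate, which matches how the summary reports them. The benchmark's actual success criterion may differ from this simple pass/fail count.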