Entity · company

Glasswing

company · active · provisional · glasswing-8ed78a4c · 1 event · first seen 23h ago

Aliases: Glasswing

More like this (12)

Project Glasswing · Waterbirds · Gray Swan · GyroSwin · PipeDream-2BW · Spider · Skybridge · SkyPilot · DRIFT · Snowflake Horizon Catalog · SwiGLU · Windsurf

Recent events (1)

7 · Anthropic News · 23h ago · source ↗

Anthropic details Fable 5 cybersecurity safeguards and proposes AI jailbreak severity framework

Anthropic has re-deployed Claude Fable 5 globally and published detailed documentation of its cybersecurity safety classifiers, which categorize uses into prohibited, high-risk dual use, low-risk dual use, and benign tiers. The post also introduces an early-draft jailbreak severity framework developed with Glasswing partners, intended to give AI developers and governments a shared vocabulary for describing jailbreak risk levels. Anthropic is soliciting public feedback on the framework and has launched a HackerOne bug bounty program for cyber jailbreaks in Fable 5. The disclosure is notable for its specificity about classifier design trade-offs, including the deliberate 'safety margin' that accepts higher false-positive rates to reduce harmful outputs.

Frontier Model Releases · AI Safety Research · HackerOne · Claude Fable 5 · Glasswing · +2 more