Almanac
technique

suicide and self-harm classifier

techniqueactiveprovisionalsuicide-and-self-harm-classifier-f34aa82c·1 events·first seen 15d ago

Aliases: suicide and self-harm classifier

Co-occurring entities

More like this (12)

Recent events (1)

6Anthropic News·15d ago·source ↗

Anthropic Details Safeguards for User Wellbeing: Crisis Detection, Anti-Sycophancy, and Evaluation Results

Anthropic has published a detailed account of its user wellbeing safeguards, covering how Claude handles suicide and self-harm conversations through model training, system prompts, and a real-time crisis classifier integrated with ThroughLine's global helpline network. The post discloses evaluation results for Claude Opus 4.5, Sonnet 4.5, and Haiku 4.5, showing 98–99% appropriate response rates on high-risk single-turn prompts and very low false-refusal rates on benign requests. Anthropic also addresses anti-sycophancy efforts and an 18+ age requirement for Claude.ai. The company is partnering with the International Association for Suicide Prevention (IASP) to further inform training and product design.