Thorn

organization · active · provisional · thorn-cf6b1dc5 · 3 events · first seen 15d ago

Aliases: Thorn

Recent events (3)

5 · Anthropic News · 13d ago

Anthropic joins Thorn-led child safety principles initiative for generative AI

Anthropic has signed onto a set of child safety principles organized by Thorn and All Tech Is Human, alongside other leading AI companies, committing to specific mitigations across model development, deployment, and maintenance. The commitments include avoiding CSAM-contaminated training data, red-teaming models for AI-generated CSAM, detecting and reporting abusive content, and reporting to NCMEC. The initiative formalizes a 'Safety by Design' framework for preventing generative AI misuse against children.

6 · Anthropic News · 13d ago

Anthropic details red teaming methods and calls for standardized AI testing practices

Anthropic published a detailed overview of red teaming approaches used to test Claude and other AI systems, covering domain-specific expert testing, automated red teaming, multilingual/multicultural testing, and multimodal red teaming. The post documents empirical findings about when each method is appropriate, highlights partnerships with organizations like Thorn, Institute for Strategic Dialogue, and Singapore's IMDA, and closes with policy recommendations for building a standardized AI testing ecosystem. The piece is notable for its operational specificity and its explicit call for industry-wide standards to enable cross-system safety comparisons.

8 · Anthropic News · 15d ago

Introducing Claude 3.5 Sonnet

Anthropic launches Claude 3.5 Sonnet, the first model in its Claude 3.5 family, claiming it outperforms Claude 3 Opus and competitor models on the GPQA, MMLU, and HumanEval benchmarks while running at twice the speed of Claude 3 Opus at mid-tier pricing ($3 per million input tokens, $15 per million output tokens). The model features a 200K context window, improved vision capabilities, and a score of 64% on an internal agentic coding evaluation, versus 38% for Claude 3 Opus. Alongside the model, Anthropic introduces Artifacts on Claude.ai, a dedicated workspace for real-time editing of AI-generated content. The UK AI Safety Institute evaluated the model before deployment, and it was assessed at ASL-2.