7·Anthropic News·19d ago

Anthropic Publishes New Claude Constitution Under CC0 License

Anthropic has released a new foundational 'constitution' document that directly shapes Claude's values and behavior during training, replacing a previous list of standalone principles with a holistic explanatory framework. The document is written primarily for Claude itself and explains the reasoning behind desired behaviors rather than just specifying rules, with the goal of better generalization to novel situations. It establishes a priority ordering, from highest to lowest: broadly safe, broadly ethical, compliant with Anthropic's guidelines, and genuinely helpful. The constitution is released under Creative Commons CC0 1.0, which allows unrestricted use, and it plays a central role in generating synthetic training data.

Frontier Model Releases·AI Safety Research·Alignment and RLHF·Creative Commons CC0 1.0·Constitutional AI·Claude·Claude's constitution·Anthropic

Related guides (4)

Claude

Claude: Anthropic's AI Assistant Built for Safety and Scale

Frontier Model Releases·Topic guide

Frontier Model Releases: The Race From Language to Action

Anthropic

Anthropic: The AI Safety Company at the Center of the Frontier

AI Safety Research·Topic guide

AI Safety Research: From Lab Evals to Geopolitical Flashpoint

Related events (8)

7·Anthropic News·19d ago

Anthropic Publishes Updated Claude's Constitution (Jan 2026 Revision)

Anthropic has released an updated version of Claude's Constitution, the explicit set of principles governing Claude's values and behavior under the Constitutional AI (CAI) framework. The post explains how CAI uses AI-generated feedback rather than large-scale human feedback to train models toward helpful, honest, and harmless behavior, with the constitution guiding both the self-critique and revision phase and the reinforcement learning phase. The constitution draws from sources including the UN Universal Declaration of Human Rights, DeepMind's Sparrow Principles, Apple's terms of service, and Anthropic's own safety research. Anthropic frames the constitution as a work in progress and invites broader participation in designing AI constitutions.

Evaluation and Benchmarking·AI Safety Research·DeepMind·Constitutional AI·Claude

7·Anthropic News·19d ago

Anthropic Publishes Political Even-Handedness Evaluation for Claude, Open-Sources Methodology

Anthropic has released a detailed account of how it trains and evaluates Claude for political even-handedness, including character traits instilled via reinforcement learning since early 2024 and a new automated evaluation methodology. The evaluation tests thousands of prompts across hundreds of political stances and benchmarks Claude Sonnet 4.5 against GPT-5, Llama 4, Grok 4, and Gemini 2.5 Pro, finding Claude comparable to Grok 4 and Gemini 2.5 Pro and more even-handed than GPT-5 and Llama 4. Anthropic is open-sourcing the evaluation framework to encourage shared industry standards for measuring political bias. The post also discloses the specific system prompt language used on Claude.ai to enforce even-handed behavior.

Frontier Model Releases·Evaluation and Benchmarking·claude.ai·Claude Sonnet 4.5·Grok 4

5·Anthropic News·19d ago

Anthropic Commits Claude to Remaining Ad-Free, Citing Alignment and User Trust

Anthropic has published a policy statement declaring that Claude will not carry advertising, sponsored content, or third-party product placements in conversations. The company argues that ad-based incentives are structurally incompatible with Claude's constitution and the goal of acting unambiguously in users' interests, citing the sensitive and personal nature of many AI conversations. Anthropic's revenue model relies on enterprise contracts and paid subscriptions, and the post signals openness to agentic commerce features where Claude acts on a user's behalf rather than on behalf of advertisers. The company acknowledges other AI companies may reach different conclusions and commits to transparency if this policy changes.

AI Safety Research·Enterprise Deployment Patterns·Claude·Claude's constitution·Anthropic

7·Anthropic News·17d ago

Anthropic launches Claude publicly with two model tiers after closed alpha

Anthropic announced the public launch of Claude on March 14, 2023, following a closed alpha with partners including Notion, Quora, and DuckDuckGo. The release introduced two model variants — Claude (high-performance) and Claude Instant (lighter and faster) — accessible via chat interface and API. Early partners reported Claude produced fewer harmful outputs and was more steerable than competing models, with deployments spanning education, legal tech, productivity, and search.

Frontier Model Releases·Enterprise Deployment Patterns·Quora·Notion·Poe

7·Latent Space·10d ago

Anthropic Claude Fable 5 (Mythos) launches with controversial usage policies

Anthropic released a new Mythos-class model, Claude Fable 5, which appears to be a significant capability release. The launch was accompanied by controversial usage terms that drew community attention and criticism. The item is a newsletter summary from Latent Space covering the release and its reception.

Frontier Model Releases·AI Safety Research·Claude Fable 5·Latent Space·Anthropic

8·Hacker News·11d ago

Anthropic releases system card for Claude Fable 5 and Claude Mythos 5

Anthropic has published a system card PDF for two new models, Claude Fable 5 and Claude Mythos 5, surfaced via Hacker News with 211 points. The system card is a primary safety and capability disclosure document accompanying a model release. The naming convention suggests these are new frontier-tier models from Anthropic, distinct from the existing Claude Opus/Sonnet/Haiku naming scheme.

Frontier Model Releases·AI Safety Research·Claude Mythos·Claude Fable 5·Anthropic

5·Anthropic News·1mo ago

Anthropic Launches Multi-Tradition Dialogue Program on AI Moral Formation

Anthropic has begun a structured outreach program engaging scholars, clergy, philosophers, and ethicists from over 15 religious and cross-cultural traditions to inform Claude's character development and values training. The initiative is framed as a research workstream on the 'moral formation' of AI systems, feeding directly into Claude's constitution and alignment evaluations. One concrete experiment emerged from these dialogues: giving Claude a mid-task tool that surfaces its own ethical commitments produced measurably lower rates of misaligned behavior on internal evaluations. Anthropic plans to expand engagement to legal scholars, psychologists, and civic institutions, with future discussions addressing AI's impact on work, institutions, and the distribution of power.

AI Safety Research·Alignment and RLHF·Claude·Claude's constitution·ethical commitment reminder tool

7·Anthropic News·16d ago

Anthropic makes Claude 3 Haiku and Sonnet available to US Intelligence Community and AWS GovCloud

Anthropic has made Claude 3 Haiku and Claude 3 Sonnet available via AWS Marketplace for the US Intelligence Community and AWS GovCloud, marking a significant expansion into government deployment. The company has crafted contractual exceptions to its general Usage Policy to permit legally authorized foreign intelligence analysis, including combating human trafficking and identifying covert influence campaigns, while maintaining restrictions on disinformation, weapons design, and malicious cyber operations. The deployment is currently limited to ASL-2 models under Anthropic's Responsible Scaling Policy. Anthropic also notes that it gave the UK AI Safety Institute pre-release access to Claude 3.5 Sonnet for pre-deployment testing.

AI Safety Research·Enterprise Deployment Patterns·AWS GovCloud·UK Artificial Intelligence Safety Institute·Claude 3.5 Sonnet