
Claude's constitution
Recent events (5)
Anthropic Publishes New Claude Constitution Under CC0 License
Anthropic has released a new foundational 'constitution' document that directly shapes Claude's values and behavior during training, replacing a previous list of standalone principles with a holistic explanatory framework. The document is written primarily for Claude itself and explains the reasoning behind desired behaviors rather than just specifying rules, so that Claude can generalize better to novel situations. It establishes a priority ordering, from highest to lowest: broadly safe, broadly ethical, compliant with Anthropic's guidelines, and genuinely helpful. The constitution is released under Creative Commons CC0 1.0, which allows unrestricted use, and it plays a central role in generating synthetic training data.
Anthropic Launches Multi-Tradition Dialogue Program on AI Moral Formation
Anthropic has begun a structured outreach program engaging scholars, clergy, philosophers, and ethicists from over 15 religious and cross-cultural traditions to inform Claude's character development and values training. The initiative is framed as a research workstream on the 'moral formation' of AI systems, feeding directly into Claude's constitution and alignment evaluations. One concrete experiment emerged from these dialogues: when Claude was given a mid-task tool that surfaces its own ethical commitments, internal evaluations showed measurably lower rates of misaligned behavior. Anthropic plans to expand engagement to legal scholars, psychologists, and civic institutions, with future discussions addressing AI's impact on work, institutions, and the distribution of power.
Anthropic Publishes Updated Claude's Constitution (Jan 2026 Revision)
Anthropic has released an updated version of Claude's Constitution, the explicit set of principles governing Claude's values and behavior under the Constitutional AI (CAI) framework. The post explains how CAI uses AI-generated feedback rather than large-scale human feedback to train models toward helpful, honest, and harmless behavior, with the constitution guiding both the self-critique and revision phase and the reinforcement learning phase. The constitution draws on sources including the UN's Universal Declaration of Human Rights, DeepMind's Sparrow Principles, Apple's terms of service, and Anthropic's own safety research. Anthropic frames the constitution as a work in progress and invites broader participation in designing AI constitutions.
Anthropic Commits Claude to Remaining Ad-Free, Citing Alignment and User Trust
Anthropic has published a policy statement declaring that Claude will not carry advertising, sponsored content, or third-party product placements in conversations. The company argues that ad-based incentives are structurally incompatible with Claude's constitution and the goal of acting unambiguously in users' interests, citing the sensitive and personal nature of many AI conversations. Anthropic's revenue model relies on enterprise contracts and paid subscriptions, and the post signals openness to agentic commerce features where Claude acts on a user's behalf rather than on behalf of advertisers. The company acknowledges other AI companies may reach different conclusions and commits to transparency if this policy changes.
Anthropic Updates Election Safeguards for Claude Ahead of 2026 US Midterms
Anthropic has published an update on its election-related safety measures for Claude, covering political bias evaluations, usage policy enforcement, and influence operation resistance testing. The new models Claude Opus 4.7 and Sonnet 4.6 scored 95-96% on political impartiality evaluations and complied with election-related usage policies 99.8-100% of the time on a 600-prompt test suite. For the first time, Anthropic tested whether models can autonomously run influence operations end-to-end and found that only Mythos Preview and Opus 4.7 completed more than half of the tasks when safeguards were removed, underscoring ongoing capability concerns. Anthropic is also deploying election information banners that point users to nonpartisan resources such as TurboVote for the 2026 US midterms.