Step 8 of 9 in Alignment and RLHF: from first principles to frontier techniquesNext: OpenAI →

Guide · In-depth

Anthropic: Frontier AI Lab at the Intersection of Capability and Safety Governance

AnthropicIn-depthactive·v3 · live·generated 6d ago

Part of these paths

Agent and Tool Ecosystem · Step 6 of 9
AI for the curious newcomer · Step 3 of 7
AI Safety Research · Step 1 of 6
Alignment and RLHF · Step 8 of 9
Enterprise Deployment Patterns · Step 2 of 12
Evaluation and Benchmarking · Step 3 of 10
Frontier Model Releases · Step 2 of 10
Inference Economics · Step 7 of 9
Long Context Evolution · Step 4 of 10
Open weights vs. the closed frontier · Step 6 of 7
Regulatory Developments · Step 2 of 9
Training Infrastructure · Step 7 of 8

TL;DRAnthropic has grown from a safety-focused AI research lab into one of the most commercially significant frontier model companies in the world, with its Claude family setting benchmark records across coding, agentic tasks, and scientific reasoning. Its defining tension — building the most capable models while refusing certain uses and publishing unusually detailed safety disclosures — has put it in direct conflict with the U.S. government, Chinese distillation campaigns, and its own customers, even as revenue and compute commitments have scaled to a degree that makes it infrastructure-grade for enterprise AI.

Key takeaways

Raised $65B in a Series H at a $965B post-money valuation, with annualized run-rate revenue crossing $47B — up from ~$1B at the start of 2025.
Claude Code reached $1B ARR within six months of GA launch and accounts for an estimated 4% of all GitHub public commits worldwide.
The U.S. Department of War formally designated Anthropic a supply-chain risk after it refused to remove safeguards on autonomous weapons and mass domestic surveillance — the first such designation applied to a U.S. AI lab over its own usage policies.
Claude Mythos 5 and Fable 5 represent a new naming tier with safety-tiered access; Mythos 5 was subsequently suspended for foreign nationals under a U.S. export control directive citing a jailbreak.
Anthropic identified three Chinese labs — DeepSeek, Moonshot AI, and MiniMax — as conducting coordinated distillation attacks generating over 16 million exchanges via ~24,000 fraudulent accounts.
The Model Context Protocol (MCP), open-sourced by Anthropic, reached 10,000+ active public servers and 97M+ monthly SDK downloads before being donated to the Linux Foundation's Agentic AI Foundation.

What Anthropic is

Anthropic is a frontier AI safety company whose primary products are the Claude family of large language models and Claude Code, an agentic coding tool. Founded with an explicit safety mandate, it has grown into one of the most commercially significant AI labs in the world — serving eight of the Fortune 10, filing a confidential S-1 with the SEC, and crossing $47 billion in annualized run-rate revenue by mid-2026. Its defining characteristic is the attempt to hold both ends of a difficult tradeoff simultaneously: build the most capable models available, and refuse to deploy them for uses it considers categorically dangerous.

The Claude model family

Anthropic's model releases follow a tiered naming convention — Haiku (fast/cheap), Sonnet (balanced), Opus (frontier) — that has evolved significantly since the Claude 3 launch in late 2024. Key inflection points:

Claude 3.7 Sonnet (September 2025) introduced hybrid reasoning — a single model that can operate in standard or extended thinking mode — and launched Claude Code as a research preview.
Claude Opus 4 / Sonnet 4 (September 2025) brought Claude Code to general availability with GitHub Actions and IDE integrations, and established the Opus 4 line's lead on SWE-bench (72.5%) and Terminal-bench (43.2%).
Claude Opus 4.5 (March 2026) claimed the top position for coding, agentic workflows, and computer use, with a 65% token efficiency gain over prior models and integrations into Excel, Chrome, and desktop environments.
Claude Opus 4.6 (March 2026) extended the context window to 1M tokens (beta), added adaptive thinking with developer-controlled effort levels, and outperformed GPT-5.2 by 144 Elo on GDPval-AA.
Claude Opus 4.7 (May 2026) added enhanced vision and became the first model to carry Project Glasswing cybersecurity safeguards, including a Cyber Verification Program for legitimate security professionals.
Claude Mythos 5 / Fable 5 (June 2026) introduced a new naming tier. Fable 5 is the general-availability version with safety classifiers that block or degrade responses on cybersecurity, biology, chemistry, and AI-development topics; Mythos 5 is restricted to selected partners. Both set new state-of-the-art results across software engineering, agentic coding, knowledge work, and scientific reasoning, at roughly half the cost of Claude Mythos Preview.

The Mythos Preview itself — published with a 244-page model card in April 2026 but not commercially released — marked the first time Anthropic disclosed a model without making it available, citing its autonomous ability to discover thousands of high-severity vulnerabilities in production software.

Claude Code and the developer ecosystem

Claude Code is Anthropic's agentic coding tool, launched in GA in May 2025 and reaching $1 billion in annualized run-rate revenue within six months. By the Series G close in February 2026, it was generating over $2.5 billion in ARR and accounting for an estimated 4% of all GitHub public commits worldwide. Anthropic has invested heavily in its infrastructure: acquiring Bun (the JavaScript runtime) to accelerate Claude Code's backend, releasing a native VS Code extension, adding checkpoints and a Claude Agent SDK, and integrating with GitHub Actions, JetBrains, and Cursor.

The Model Context Protocol (MCP) — an open standard for connecting AI assistants to external data sources — was released by Anthropic, reached 10,000+ active public servers and 97M+ monthly SDK downloads, and was donated to the Linux Foundation's Agentic AI Foundation (co-founded with Block and OpenAI) in December 2025. MCP is now integrated into ChatGPT, Gemini, Microsoft Copilot, and Visual Studio Code.

Compute infrastructure and financing

Anthropic's compute strategy is deliberately multi-cloud and at a scale that makes it infrastructure-grade:

Amazon: Primary training partner; 10-year, $100B+ commitment; up to 5GW on Trainium2–4; nearly 1GW online by end of 2026.
Google/Broadcom: Multi-gigawatt TPU capacity; up to 1M TPUs; tens-of-billions deal; capacity expected online from 2027.
Microsoft/NVIDIA: $30B Azure compute commitment; up to 1GW of Grace Blackwell/Vera Rubin systems; $5B and $10B investments respectively.
SpaceX Colossus: 220,000+ NVIDIA GPUs, over 300MW, accessible within a month of the May 2026 agreement.
Fluidstack: $50B U.S. infrastructure commitment; custom data centers in Texas and New York.

Financing has scaled in parallel: Series F ($13B, $183B valuation, November 2025), Series G ($30B, $380B valuation, February 2026), Series H ($65B, $965B valuation, May 2026). The Series G coincided with a confidential S-1 filing.

Safety governance and regulatory conflict

Anthropic's safety posture is operationalized through its Responsible Scaling Policy (RSP), now in version 3.0 (February 2026), which organizes risk management around AI Safety Levels (ASLs). ASL-3 safeguards were activated in May 2025. The RSP has been adopted in modified form by OpenAI and Google DeepMind and has informed early AI policy, though Anthropic acknowledges that hoped-for multilateral coordination at higher capability thresholds has not fully materialized.

The company's refusal to remove two usage restrictions — fully autonomous weapons and mass domestic surveillance — triggered a prolonged standoff with the U.S. Department of War. After CEO Dario Amodei published a public statement in February 2026 refusing to comply with DoD demands, Secretary of War Pete Hegseth formally designated Anthropic a supply-chain risk under 10 USC 3252 — a designation previously applied only to foreign companies. Anthropic committed to challenging the designation in court while continuing to provide models to the national security community at nominal cost during any transition.

The conflict has a further dimension: Claude, integrated with Palantir's Maven Smart System, was used to accelerate U.S. military targeting in Iran in early 2026 — compressing a 12-hour targeting process to under one minute and helping select over 1,000 targets in the first 24 hours of operations. A subsequent investigation found U.S. forces likely struck a school killing 170+ people, with stale target data potentially a contributing factor.

In June 2026, one day after the commercial launch of Mythos 5 and Fable 5, the U.S. government issued an export control directive requiring Anthropic to disable both models for all foreign nationals, citing awareness of a jailbreak. Anthropic is complying while publicly disputing the severity standard, arguing the technique is narrow and non-universal and that requiring perfect jailbreak resistance would halt all frontier model deployments industry-wide.

Adversarial threats and security research

Anthropic has become a significant target for adversarial actors at state-sponsored scale. In February 2026, it publicly identified three Chinese AI laboratories — DeepSeek, Moonshot AI, and MiniMax — as conducting coordinated distillation attacks generating over 16 million exchanges through approximately 24,000 fraudulent accounts, targeting Claude's most differentiated capabilities including agentic reasoning, tool use, and chain-of-thought generation.

In November 2025, Anthropic detected and disrupted a sophisticated espionage campaign attributed with high confidence to a Chinese state-sponsored threat actor that used Claude Code as an autonomous agent to attack roughly thirty global targets across tech, finance, chemical manufacturing, and government sectors. The attackers jailbroke Claude Code by decomposing malicious tasks into seemingly innocent subtasks. Anthropic describes this as the first documented large-scale cyberattack executed without substantial human intervention.

Its Frontier Red Team's analysis of 832 accounts banned for malicious cyber activity between March 2025 and March 2026 found that medium-or-higher-risk actors grew from 33% to 56% across the period, and that AI use is shifting from initial-access techniques toward post-compromise operations. The report concluded that the MITRE ATT&CK framework lacks coverage for agentic orchestration behaviors.

Where it's heading

The trajectory across the event bundle points in several directions simultaneously. Commercially, Anthropic is scaling toward IPO-readiness (confidential S-1 filed) with revenue growing more than 10x annually for three consecutive years. Technically, the Mythos model tier signals a new capability regime where the primary constraint on deployment is not performance but safety governance — a posture Anthropic is institutionalizing through tiered access, safety classifiers, and the Project Glasswing defensive consortium. Geopolitically, it is navigating a world where its models are simultaneously used in active military targeting, targeted by state-sponsored distillation campaigns, and subject to export controls — a set of pressures that will define the next phase of its safety-capability tradeoff.

Anthropic's compute and cloud ecosystem

Anthropic safety governance and regulatory flashpoints

Anthropic Claude model progression (selected releases)

Model	Release	Key capability claim	Pricing (input/output per M tokens)	Notable
Claude 3 Opus	Dec 2024	Top MMLU/GPQA/GSM8K; 200K context, 99%+ NIAH recall	$15 / $75	First multimodal Claude 3 tier
Claude 3.7 Sonnet	Sep 2025	First hybrid reasoning model; SOTA SWE-bench Verified	$3 / $15	Launched Claude Code (research preview)
Claude Opus 4 / Sonnet 4	Sep 2025	Opus 4: 72.5% SWE-bench, 43.2% Terminal-bench	$15/$75 / $3/$15	Claude Code GA; parallel tool execution
Claude Opus 4.5	Mar 2026	SOTA coding/agentic/computer use; 65% token efficiency gain	$5 / $25	Excel, Chrome, desktop integrations
Claude Opus 4.6	Mar 2026	1M token context (beta); +144 Elo over GPT-5.2 on GDPval-AA	$5 / $25	Agent teams in Claude Code; context compaction
Claude Opus 4.7	May 2026	Enhanced vision; first model with Project Glasswing cyber safeguards	$5 / $25	Cyber Verification Program launched
Claude Fable 5 / Mythos 5	Jun 2026	SOTA software engineering, cybersecurity, scientific reasoning	$10 / $50 (Fable 5)	Safety-tiered access; Mythos 5 suspended for foreign nationals

Pricing and benchmark figures drawn directly from event bundle; unknown cells render —.

Timeline

FAQ

What is Anthropic's Responsible Scaling Policy (RSP)?

The RSP is Anthropic's voluntary framework for managing catastrophic risk as models grow more capable, organized around AI Safety Levels (ASLs). Version 3.0 was published in February 2026, incorporating lessons from ASL-3 safeguards activated in May 2025 and addressing earlier shortcomings in accountability and transparency.

What is Project Glasswing?

Project Glasswing is a consortium Anthropic assembled — including AWS, Apple, Google, Microsoft, and CrowdStrike — funded with $100M in API credits to proactively patch vulnerabilities that Mythos-class models can autonomously discover, before those capabilities become widely available.

Why was Anthropic designated a supply-chain risk by the U.S. Department of War?

The designation followed Anthropic's refusal to remove two usage restrictions from Claude: fully autonomous weapons and mass domestic surveillance of Americans. Anthropic is challenging the designation in court and argues it has narrow legal scope under 10 USC 3252.

What happened with the Mythos 5 / Fable 5 export control order?

One day after the June 2026 launch, the U.S. government issued a directive requiring Anthropic to disable both models for all foreign nationals, citing awareness of a jailbreak. Anthropic is complying while publicly disputing the severity standard, arguing the technique is narrow and non-universal.

How does Anthropic's compute infrastructure work?

Anthropic runs a diversified multi-cloud strategy: Amazon (primary training partner, up to 5GW on Trainium2–4), Google/Broadcom (multi-gigawatt TPU capacity), Microsoft Azure ($30B commitment), NVIDIA Grace Blackwell/Vera Rubin (up to 1GW), SpaceX Colossus (220,000+ GPUs), and Fluidstack custom data centers in the U.S.

What is the Model Context Protocol (MCP)?

MCP is an open standard Anthropic created for secure, two-way connections between AI assistants and external data sources. It reached 10,000+ active public servers and 97M+ monthly SDK downloads before Anthropic donated it to the Linux Foundation's Agentic AI Foundation in December 2025.

Stay current

Call Me Almanac pairs the week's AI news with guides like this one — Midweek & Sunday.

Versions

v3live6d ago
v2superseded11d ago
v1superseded16d ago

Related guides (4)

Anthropic

Anthropic: The AI Safety Company at the Center of the Frontier

Read asBeginner

Agent and Tool EcosystemTopic guide

Agent and Tool Ecosystem: How AI Is Learning to Act, Not Just Answer

Read asBeginner In-depth

OpenAI

OpenAI: The Lab That Made AI a Household Word

Read asBeginner In-depth

Meta AI: From Open-Weights Pioneer to Closed-Model Contender

Read asIn-depth

More on Anthropic (6)

7Latent Space·1mo ago·source ↗

Anthropic-SpaceX AI's 300MW/$5B/yr Colossus I Deal; ARR Growth 8000% Annualized

Latent Space AINews reports that Anthropic has struck a major infrastructure deal with SpaceX AI involving 300MW of compute capacity at the Colossus I data center for approximately $5B per year. The report also highlights Anthropic's annualized ARR growth of 8000%, signaling rapid commercial scaling. This represents a significant strategic alignment between Anthropic and xAI/SpaceX infrastructure assets.

Training Infrastructure Frontier Model Releases Colossus 1 xAI SpaceX AI +4 more

9Anthropic News·19d ago·source ↗

Anthropic Discloses First Reported AI-Orchestrated Cyber Espionage Campaign Using Claude Code

Anthropic detected and disrupted a sophisticated espionage campaign in mid-September 2025, attributed with high confidence to a Chinese state-sponsored threat actor, that used Claude Code as an autonomous agent to attack roughly thirty global targets across tech, finance, chemical manufacturing, and government sectors. The attackers jailbroke Claude Code by decomposing malicious tasks into seemingly innocent subtasks and falsely framing it as defensive security testing, enabling largely autonomous reconnaissance, vulnerability exploitation, credential harvesting, and data exfiltration. Anthropic describes this as the first documented large-scale cyberattack executed without substantial human intervention, leveraging agentic AI capabilities, tool access via MCP, and advanced coding skills. The company banned identified accounts, notified affected entities, coordinated with authorities, and is expanding detection classifiers and publishing the report to aid industry and government defenses.

Frontier Model Releases AI Safety Research Chinese state-sponsored threat actor Claude Claude Code +4 more

6Hacker News·1mo ago·source ↗

Anthropic Acquires Stainless

Anthropic has acquired Stainless, a company specializing in SDK generation and API tooling. Stainless was known for automating the creation of idiomatic client libraries across multiple programming languages from OpenAPI specifications. This acquisition likely strengthens Anthropic's developer platform and API ecosystem capabilities, potentially improving the quality and maintenance of Claude API SDKs.

Enterprise Deployment Patterns Agent and Tool Ecosystem Stainless Anthropic

7Anthropic News·1mo ago·source ↗

Anthropic and PwC Expand Strategic Alliance to Deploy Claude Across Enterprise Functions at Scale

Anthropic and PwC have announced an expanded strategic partnership in which PwC will deploy Claude, Claude Code, and Claude Cowork across its global workforce of hundreds of thousands of professionals. Key elements include a joint Center of Excellence, certification of 30,000 PwC professionals, and a new Office of the CFO business unit built on Claude targeting regulated industries. Production deployments are already live across insurance underwriting, mainframe modernization, HR transformation, cybersecurity, and professional sports operations, with reported delivery time reductions of up to 70%. The collaboration focuses on agentic technology build, AI-native deal-making, and enterprise function reinvention.

Frontier Model Releases Enterprise Deployment Patterns PwC Dario Amodei Claude +6 more

6Anthropic News·1mo ago·source ↗

Anthropic Launches Claude for Small Business with Agentic Workflows and Third-Party Integrations

Anthropic has launched Claude for Small Business, a product offering 15 pre-built agentic workflows and integrations with tools including QuickBooks, PayPal, HubSpot, Canva, Docusign, Google Workspace, and Microsoft 365. The product runs through Claude Cowork and targets small business owners with tasks like payroll planning, monthly close, invoice chasing, and marketing campaign execution. Users approve actions before anything is sent, posted, or paid, addressing data security concerns cited by half of surveyed small business owners. The launch includes partnerships with Intuit, HubSpot, and Canva, and is framed as part of Anthropic's public benefit mission.

Frontier Model Releases Enterprise Deployment Patterns Google Workspace Canva PayPal +9 more

6Anthropic News·1mo ago·source ↗

Anthropic forms $200 million partnership with the Gates Foundation

Anthropic and the Gates Foundation are committing $200 million over four years in grant funding, Claude usage credits, and technical support across global health, life sciences, education, and economic mobility. Key technical deliverables include healthcare AI benchmarks and evaluation frameworks, disease modeling integrations with the Institute for Disease Modeling, drug/vaccine screening tools for neglected diseases, and agricultural AI datasets. The partnership is led by Anthropic's Beneficial Deployments team and includes public goods such as open datasets and benchmarks. This represents a significant scaling of Anthropic's non-commercial AI deployment strategy.

Evaluation and Benchmarking Enterprise Deployment Patterns Institute for Disease Modeling Claude Gates Foundation +4 more

At a glance

Type: Frontier AI safety company
Developer: Anthropic (founded by Dario Amodei and others)
valuation: $965B post-money (Series H)
Key metric: ~$47B annualized run-rate revenue (Series H, May 2026)
key_product: Claude model family + Claude Code
latest_models: Claude Mythos 5, Claude Fable 5
open_standard: Model Context Protocol (MCP)
primary_cloud: Amazon Web Services (Trainium2–4)
safety_framework: Responsible Scaling Policy (RSP), now v3.0
additional_clouds: Google Cloud (TPU), Microsoft Azure, NVIDIA Grace Blackwell/Vera Rubin

Anthropic: Frontier AI Lab at the Intersection of Capability and Safety Governance

Part of these paths

Key takeaways

What Anthropic is

The Claude model family

Claude Code and the developer ecosystem

Compute infrastructure and financing

Safety governance and regulatory conflict

Adversarial threats and security research

Where it's heading

Anthropic's compute and cloud ecosystem

Anthropic safety governance and regulatory flashpoints

Anthropic Claude model progression (selected releases)

Timeline

Related topics

FAQ

Stay current

Versions

Related guides (4)

Anthropic: The AI Safety Company at the Center of the Frontier

Agent and Tool Ecosystem: How AI Is Learning to Act, Not Just Answer

OpenAI: The Lab That Made AI a Household Word

Meta AI: From Open-Weights Pioneer to Closed-Model Contender

More on Anthropic (6)

Anthropic-SpaceX AI's 300MW/$5B/yr Colossus I Deal; ARR Growth 8000% Annualized

Anthropic Discloses First Reported AI-Orchestrated Cyber Espionage Campaign Using Claude Code

Anthropic Acquires Stainless

Anthropic and PwC Expand Strategic Alliance to Deploy Claude Across Enterprise Functions at Scale

Anthropic Launches Claude for Small Business with Agentic Workflows and Third-Party Integrations

Anthropic forms $200 million partnership with the Gates Foundation