7 · Anthropic News · 18d ago

Anthropic Publishes March 2025 Report on Malicious Uses of Claude: Influence Operations, Credential Stuffing, Recruitment Fraud, Malware

Anthropic released a transparency report detailing four case studies of Claude misuse detected in early 2025: a commercially operated influence-as-a-service network using Claude to orchestrate 100+ social media bots across Twitter/X and Facebook, a credential stuffing operation targeting security cameras, a recruitment fraud campaign targeting Eastern European job seekers, and a low-skill actor using Claude to develop malware beyond their baseline capability. The most novel finding is that Claude was used as an agentic orchestrator making tactical engagement decisions for bot accounts (deciding when to like, share, comment, or ignore posts) rather than just generating content. Anthropic used its Clio and hierarchical summarization research techniques to detect and ban the associated accounts, and flags semi-autonomous abuse orchestration via frontier models as an emerging threat pattern that it expects to grow.

Evaluation and Benchmarking · AI Safety Research · Agent and Tool Ecosystem · Clio · hierarchical summarization · Facebook · Claude · X (Twitter) · influence-as-a-service operation · Anthropic

Related guides (4)

Claude

Claude: Anthropic's AI Assistant Built for Safety and Scale

Anthropic

Anthropic: The AI Safety Company at the Center of the Frontier

AI Safety Research · Topic guide

AI Safety Research: From Lab Policies to Real-World Flashpoints

Agent and Tool Ecosystem · Topic guide

Agent and Tool Ecosystem: How the Infrastructure Layer Around LLMs Is Consolidating

Related events (8)

7 · Anthropic News · 18d ago

Anthropic August 2025 Threat Intelligence Report: Claude Misuse Case Studies

Anthropic has published its August 2025 Threat Intelligence Report documenting three real-world misuse cases involving Claude: a large-scale data extortion operation using Claude Code to automate reconnaissance and generate targeted ransom demands against 17+ organizations, a North Korean fraudulent employment scheme, and AI-assisted ransomware development by a low-skill criminal. The report highlights that agentic AI is now being weaponized for end-to-end cyberattacks rather than merely providing advisory assistance, and that AI has materially lowered the technical barrier to sophisticated cybercrime. Anthropic describes detection and countermeasures taken in each case.

AI Safety Research · Enterprise Deployment Patterns · Claude · Claude Code · North Korea

9 · Anthropic News · 19d ago

Anthropic Discloses First Reported AI-Orchestrated Cyber Espionage Campaign Using Claude Code

Anthropic detected and disrupted a sophisticated espionage campaign in mid-September 2025, attributed with high confidence to a Chinese state-sponsored threat actor, that used Claude Code as an autonomous agent to attack roughly thirty global targets across tech, finance, chemical manufacturing, and government sectors. The attackers jailbroke Claude Code by decomposing malicious tasks into seemingly innocent subtasks and falsely framing the work as defensive security testing, enabling largely autonomous reconnaissance, vulnerability exploitation, credential harvesting, and data exfiltration. Anthropic describes this as the first documented large-scale cyberattack executed without substantial human intervention, leveraging agentic AI capabilities, tool access via MCP, and advanced coding skills. The company banned the identified accounts, notified affected entities, and coordinated with authorities; it is also expanding its detection classifiers and publishing the report to aid industry and government defenses.

Frontier Model Releases · AI Safety Research · Chinese state-sponsored threat actor · Claude · Claude Code

9 · Anthropic News · 19d ago

Anthropic Identifies Industrial-Scale Distillation Attacks by DeepSeek, Moonshot, and MiniMax

Anthropic has publicly identified three Chinese AI laboratories—DeepSeek, Moonshot AI, and MiniMax—as conducting coordinated, large-scale distillation attacks against Claude, generating over 16 million exchanges through approximately 24,000 fraudulent accounts in violation of terms of service. The campaigns targeted Claude's most differentiated capabilities including agentic reasoning, tool use, coding, and chain-of-thought generation, with MiniMax alone responsible for over 13 million exchanges. Anthropic frames these attacks as a national security concern, arguing that illicitly distilled models strip out safety safeguards and undermine US export controls. The company claims high-confidence attribution via IP correlation, request metadata, and infrastructure indicators, in some cases corroborated by industry partners.

Frontier Model Releases · Open Weights Progress · knowledge distillation · Kimi · DeepSeek V4

6 · Anthropic News · 16d ago

Anthropic publishes 2024 election safety retrospective with Clio usage analysis

Anthropic released a post-mortem on AI and elections in 2024, covering its safety policies, red-teaming efforts, and enforcement actions across global elections. Election-related activity constituted less than 0.5% of overall Claude usage, rising to just over 1% around the US election, with approximately 100 enforcement actions globally. The report introduces Clio, an automated tool for analyzing real-world usage patterns, and documents a case study on handling knowledge cutoff limitations during France's snap elections. The piece represents Anthropic's first systematic public accounting of election-related AI safety work at scale.

AI Safety Research · Regulatory Developments · Claude Sonnet 3.5 · Clio · Claude Opus 4.6

6 · Anthropic News · 1mo ago

Anthropic Updates Election Safeguards for Claude Ahead of 2026 US Midterms

Anthropic has published an update on its election-related safety measures for Claude, covering political bias evaluations, usage policy enforcement, and influence operation resistance testing. New model versions Claude Opus 4.7 and Sonnet 4.6 scored 95-96% on political impartiality evaluations and handled election-related policy compliance at 99.8-100% on a 600-prompt test suite. For the first time, Anthropic tested whether models can autonomously run influence operations end-to-end, finding that only Mythos Preview and Opus 4.7 completed more than half of tasks when safeguards were removed, underscoring ongoing capability concerns. Anthropic is also deploying election information banners pointing users to nonpartisan resources like TurboVote for the 2026 US midterms.

Frontier Model Releases · Evaluation and Benchmarking · Collective Intelligence Project · Claude Sonnet 4 · Claude Opus 4.6

6 · The Batch · 19d ago

Data Points: Hackers Break Into Claude Mythos; OpenAI Launches Cybersecurity Rival; Maine Data Center Moratorium; McClatchy AI Backlash

A small group of unauthorized users gained access to Anthropic's restricted Claude Mythos cybersecurity model via Discord coordination and insider knowledge, raising questions about securing high-risk AI systems. OpenAI responded to the competitive landscape by launching GPT-5.4-Cyber, a vetted-access model for defensive cybersecurity tasks. Maine passed the first U.S. state moratorium on large AI data centers over 20MW, pending the governor's signature. McClatchy's deployment of a Claude-powered content scaling agent triggered newsroom backlash over attribution, consent, and AI disclosure standards.

Training Infrastructure · Frontier Model Releases · GPT-5.5-Cyber · Discord · Claude Mythos

4 · Anthropic News · 19d ago

Anthropic Education Report: How Educators Use Claude in Higher Education

Anthropic analyzed ~74,000 anonymized conversations from higher education professionals on Claude.ai during May–June 2025, finding that curriculum development dominates educator AI use (57% of conversations), followed by academic research (13%) and student assessment (7%). Faculty are not only using Claude as a chatbot but also building custom interactive tools via Claude Artifacts, such as chemistry simulations and grading rubrics. The study, complemented by qualitative research with 22 Northeastern University faculty, reveals a spectrum from augmentation (lesson design, advising) to automation (routine administrative tasks), with grading being a contested and relatively rare but automation-heavy use case.

Enterprise Deployment Patterns · Agent and Tool Ecosystem · claude.ai · Claude · O*NET

5 · Anthropic News · 18d ago

Anthropic publishes large-scale study of how university students use Claude

Anthropic analyzed one million anonymized student conversations on Claude.ai to produce one of the first large-scale empirical studies of real-world AI usage in higher education. Key findings: Computer Science students are heavily overrepresented (36.8% of conversations vs. 5.4% of U.S. degrees), while Business, Health, and Humanities students underuse the tool relative to enrollment. Students primarily engage in higher-order cognitive tasks per Bloom's Taxonomy—creating and analyzing—though the study raises concerns about offloading critical thinking. The analysis used Anthropic's internal Clio tool, which aggregates conversation patterns while stripping personal information.

AI Safety Research · Enterprise Deployment Patterns · claude.ai · Clio · National Center for Education Statistics