6 · Anthropic News · 17d ago

Anthropic partners with U.S. National Labs for 1,000 Scientist AI Jam evaluating Claude on scientific tasks

Anthropic is participating in the U.S. Department of Energy's first 1,000 Scientist AI Jam, which brings together scientists from multiple National Laboratories to evaluate frontier AI models on scientific research and national security applications. Claude 3.7 Sonnet, recently launched and described by Anthropic as the first hybrid reasoning model on the market, will be a primary subject of evaluation across tasks including hypothesis generation, experiment planning, code generation, and result analysis. The event builds on Anthropic's April 2024 collaboration with the National Nuclear Security Administration, the first instance of a frontier lab evaluating a model in a Top Secret classified environment. The partnership signals deepening government-industry collaboration on AI for scientific discovery and national security.

Frontier Model Releases · AI Safety Research · Enterprise Deployment Patterns · National Nuclear Security Administration · U.S. Department of Energy · Claude 3.7 Sonnet · Anthropic

Related guides (4)

Frontier Model Releases · Topic guide

Frontier Model Releases: The Race From Language to Action

Anthropic

Anthropic: The AI Safety Company at the Center of the Frontier

AI Safety Research · Topic guide

AI Safety Research: From Lab Policies to Real-World Flashpoints

Enterprise Deployment Patterns · Topic guide

Enterprise Deployment Patterns: From LLM Demo to Production Reality

Related events (8)

7 · Anthropic News · 19d ago

Anthropic Partners with US Department of Energy on Genesis Mission for AI-Driven Scientific Discovery

Anthropic and the US Department of Energy have announced a multi-year partnership under the DOE's Genesis Mission initiative, targeting AI deployment across energy, biological sciences, and scientific productivity domains. The partnership will give DOE researchers access to Claude and to Anthropic engineers, who will develop purpose-built agents, Model Context Protocol servers, and specialized Claude Skills for scientific workflows. The collaboration has potential reach across all 17 US national laboratories and builds on prior work, including a nuclear risk classifier developed with the National Nuclear Security Administration and a Claude deployment at Lawrence Livermore. This represents a significant expansion of Anthropic's US government footprint.

Enterprise Deployment Patterns · Regulatory Developments · Claude · Jared Kaplan · National Nuclear Security Administration · +7 more

7 · Anthropic News · 19d ago

Anthropic Partners with Allen Institute and HHMI to Deploy Claude in Frontier Life Sciences Research

Anthropic has announced flagship partnerships with the Allen Institute and Howard Hughes Medical Institute (HHMI) to embed Claude into active scientific workflows at both institutions. HHMI's collaboration, anchored at Janelia Research Campus, focuses on developing specialized AI agents integrated with scientific instruments and analysis pipelines. The Allen Institute partnership targets multi-agent systems for multi-modal biological data analysis, including multi-omic integration, knowledge graph management, and experimental design coordination. Both partnerships emphasize interpretability, researcher autonomy, and transparency, with the stated goal of compressing months of manual analysis while keeping human scientists in control of scientific direction.

AI Safety Research · Enterprise Deployment Patterns · AI@HHMI · Janelia Research Campus · Claude · +4 more

7 · Anthropic News · 16d ago

Anthropic makes Claude 3 Haiku and Sonnet available to US Intelligence Community and AWS GovCloud

Anthropic has made Claude 3 Haiku and Claude 3 Sonnet available via AWS Marketplace for the US Intelligence Community and AWS GovCloud, marking a significant expansion into government deployment. The company has crafted contractual exceptions to its general Usage Policy to permit legally authorized foreign intelligence analysis, including combating human trafficking and identifying covert influence campaigns, while maintaining restrictions on disinformation, weapons design, and malicious cyber operations. The deployment is currently limited to ASL-2 models under Anthropic's Responsible Scaling Policy. Anthropic also notes that it previously gave the UK AI Safety Institute pre-release access to Claude 3.5 Sonnet for pre-deployment testing.

AI Safety Research · Enterprise Deployment Patterns · AWS GovCloud · UK Artificial Intelligence Safety Institute · Claude 3.5 Sonnet · +8 more

6 · Anthropic News · 19d ago

How scientists are using Claude to accelerate research and discovery

Anthropic describes how researchers are deploying Claude-powered systems across scientific workflows, highlighting three case studies, among them Biomni (a Stanford agentic platform integrating hundreds of biomedical tools) and the Cheeseman Lab (automating the interpretation of large-scale gene knockout experiments). The piece details Claude for Life Sciences and the AI for Science program, which provides free API credits to high-impact research projects. Specific results cited include compressing months-long GWAS analyses to 20 minutes and analyzing 336,000 single-cell datasets to identify novel transcription factors.

Frontier Model Releases · Enterprise Deployment Patterns · Claude Opus 4.6 · Stanford University · Claude · +9 more

9 · Anthropic News · 17d ago

Anthropic introduces computer use capability, upgraded Claude 3.5 Sonnet, and Claude 3.5 Haiku

Anthropic announced three major developments: an upgraded Claude 3.5 Sonnet with significant coding improvements (SWE-bench Verified rising from 33.4% to 49.0%, surpassing all publicly available models, including reasoning models), a new Claude 3.5 Haiku that matches Claude 3 Opus performance at Haiku-tier speed, and a public beta of "computer use", a capability that lets Claude control computers by viewing screens, moving cursors, clicking, and typing. Computer use is available via the Anthropic API, Amazon Bedrock, and Google Cloud Vertex AI, with early adopters including Replit, The Browser Company, and Cognition. The US and UK AI Safety Institutes both conducted pre-deployment testing, and the model was assessed as remaining within ASL-2 under Anthropic's Responsible Scaling Policy.

Frontier Model Releases · Evaluation and Benchmarking · OpenAI o1-preview · Amazon Bedrock · Claude 3.5 Sonnet · +15 more

8 · Anthropic News · 18d ago

Introducing Claude 3.5 Sonnet

Anthropic launches Claude 3.5 Sonnet, the first model in its Claude 3.5 family, claiming it outperforms Claude 3 Opus and competitor models on the GPQA, MMLU, and HumanEval benchmarks while operating at twice the speed of Claude 3 Opus at mid-tier pricing ($3 per million input tokens and $15 per million output tokens). The model features a 200K context window, improved vision capabilities, and a score of 64% on an internal agentic coding evaluation, versus 38% for Claude 3 Opus. Alongside the model, Anthropic introduces Artifacts on Claude.ai, a dedicated workspace for real-time editing of AI-generated content. The UK AI Safety Institute evaluated the model before deployment, and it was assessed at ASL-2.

Long Context Evolution · Frontier Model Releases · claude.ai · Thorn · Amazon Bedrock · +16 more

8 · Anthropic News · 17d ago

Anthropic Frontier Red Team reports early-warning signs of rapid AI progress in cybersecurity and biosecurity capabilities

Anthropic's Frontier Red Team published findings from a year of safety evaluations across four model releases, documenting rapid capability gains in dual-use domains. In cybersecurity, Claude 3.7 Sonnet now solves roughly a third of Cybench CTF challenges (up from about 5% a year ago) and, with the Incalmo toolset, was able to replicate a large-scale network attack in realistic cyber range environments. In biosecurity, Claude has gone from underperforming virology experts to exceeding them on the VCT benchmark within one year, and it exceeds human expert baselines on cloning workflows. Anthropic assesses current models as showing "early warning" signs but not yet crossing the thresholds for substantially elevated national security risk.

Frontier Model Releases · Evaluation and Benchmarking · Intercode CTF · Carnegie Mellon University · LabBench · +7 more

9 · Anthropic News · 19d ago

Claude 3.7 Sonnet and Claude Code: Anthropic's First Hybrid Reasoning Model and Agentic Coding Tool

Anthropic has released Claude 3.7 Sonnet, described as its most capable model to date and the first hybrid reasoning model on the market, able to operate in both standard and extended thinking modes within a single unified model. The model achieves state-of-the-art results on SWE-bench Verified and TAU-bench, with particular strength in coding and front-end web development. Alongside the model, Anthropic is launching Claude Code in limited research preview, a command-line agentic coding tool that can read and edit files, run tests, and push to GitHub. Pricing remains unchanged at $3 per million input tokens and $15 per million output tokens, with availability across Claude.ai plans, Amazon Bedrock, and Google Cloud Vertex AI.

Frontier Model Releases · Evaluation and Benchmarking · Canva · Amazon Bedrock · GitHub · +14 more