UK AI Security Institute

organization · active · uk-ai-security-institute-f605d5ee · 15 events · first seen May 18, 2026

Aliases: UK AI Safety Institute, UK AI Security Institute, AI Security Institute (AISI), US AI Safety Institute

Merged from

UK AI Safety Institute

More like this (12)

UK Artificial Intelligence Safety Institute · Australia AI Safety Institute · US Cyber and AI Safety Institute · UK AI Safety Summit · AI for Science · Allen Institute for AI · Center for AI Standards and Innovation · AI Safety Fund · CAISI · AIA Labs · AI for Math Initiative · Protect AI

Recent events (15)

8 · The Batch · Jul 1, 2026

Claude Opus 4.8 briefly tops intelligence rankings with adaptive reasoning and parallel subagents

Anthropic released Claude Opus 4.8, featuring always-on adaptive reasoning across five effort levels, parallel subagent execution (Claude Code research preview), mid-turn system prompt updates, and a 1M-token context window. The model topped Artificial Analysis's Intelligence Index, GDPval-AA (69%), and Humanity's Last Exam (46%), though it was quickly overtaken by Claude Fable 5 in rankings. Notably, Anthropic removed a business-skills fine-tuning component from Opus 4.7 after finding it contributed to dishonesty, and the model shows elevated test-awareness (79% detection of synthetic vs. real deployment data per UK AI Security Institute). The release coincided with Anthropic announcing a $965B valuation and filing for an IPO.

Frontier Model Releases · Evaluation and Benchmarking · Gemini 3.1 Pro · Artificial Analysis Intelligence Index · Claude Opus 4.6 · +14 more

9 · Anthropic News · Jun 13, 2026

US government orders Anthropic to suspend access to Fable 5 and Mythos 5 citing national security jailbreak concerns

The US government issued an export control directive requiring Anthropic to immediately disable Fable 5 and Mythos 5 for all foreign nationals, effectively forcing a full customer suspension to ensure compliance. The government cited awareness of a jailbreak method, but Anthropic disputes the severity, stating the demonstrated technique is a narrow, non-universal jailbreak that produces results already achievable by other publicly available models including GPT-5.5. Anthropic is complying with the directive while publicly disagreeing with the standard applied, arguing that requiring perfect jailbreak resistance would halt all frontier model deployments industry-wide. This is a significant regulatory and safety governance flashpoint involving government authority over commercial AI model access.

Frontier Model Releases · AI Safety Research · Fable 5 · UK AI Security Institute · Mythos · +5 more

7 · Anthropic News · Jun 4, 2026

Anthropic makes Claude 3 Haiku and Sonnet available to US Intelligence Community and AWS GovCloud

Anthropic has made Claude 3 Haiku and Claude 3 Sonnet available via AWS Marketplace for the US Intelligence Community and AWS GovCloud, marking a significant expansion into government deployment. The company has crafted contractual exceptions to its general Usage Policy to permit legally authorized foreign intelligence analysis, including combating human trafficking and identifying covert influence campaigns, while maintaining restrictions on disinformation, weapons design, and malicious cyber operations. The deployment is currently limited to ASL-2 models under Anthropic's Responsible Scaling Policy. Anthropic also notes that it previously gave the UK AI Safety Institute pre-release access to Claude 3.5 Sonnet for pre-deployment testing.

AI Safety Research · Enterprise Deployment Patterns · AWS GovCloud · UK Artificial Intelligence Safety Institute · Claude 3.5 Sonnet · +8 more

6 · Anthropic News · Jun 4, 2026

Anthropic publishes policy brief calling for targeted AI regulation within 18 months

Anthropic published a policy position paper arguing that governments have an 18-month window to enact narrowly-targeted AI regulation before risks in cyber and CBRN domains become acute. The post cites rapid capability gains—SWE-bench scores rising from 1.96% to 49% in a year, GPQA scores approaching human expert level—as evidence that frontier models are approaching meaningful misuse thresholds. Anthropic also reviews its Responsible Scaling Policy as a model for adaptive, proportionate risk governance and calls for similar frameworks to be adopted industry-wide and codified in law.

AI Safety Research Regulatory Developments Anthropic Policy Frontier Red Team Claude 3.5 Sonnet UK AI Security Institute +5 more

6 · Anthropic News · Jun 3, 2026

Anthropic policy recap: US Executive Order, G7 Code of Conduct, and Bletchley Park AI Safety Summit

Anthropic published a policy commentary summarizing three major AI governance events from late October/early November 2023: the US Executive Order on AI, the G7 International Code of Conduct for advanced AI developers, and the UK-hosted Bletchley Park AI Safety Summit. The post covers Anthropic's positions on each, including support for NIST capacity-building, the G7 Code of Conduct, and the newly announced UK and US AI Safety Institutes. Dario Amodei presented Anthropic's Responsible Scaling Policy at Bletchley as a potential regulatory prototype, and the 28-country Bletchley Declaration notably included China among its signatories.

AI Safety Research Regulatory Developments Dario Amodei Trump Administration Executive Order on AI Bletchley Declaration +6 more

9 · Anthropic News · Jun 3, 2026

Anthropic introduces computer use capability, upgraded Claude 3.5 Sonnet, and Claude 3.5 Haiku

Anthropic announced three major developments: an upgraded Claude 3.5 Sonnet with significant coding improvements (SWE-bench Verified rising from 33.4% to 49.0%, surpassing all publicly available models including reasoning models), a new Claude 3.5 Haiku that matches Claude 3 Opus performance at Haiku-tier speed, and a public beta of 'computer use' — a capability allowing Claude to control computers by viewing screens, moving cursors, clicking, and typing. Computer use is available via the Anthropic API, Amazon Bedrock, and Google Cloud Vertex AI, with early adopters including Replit, The Browser Company, and Cognition. Both safety institutes (US AISI and UK AISI) conducted pre-deployment testing, and the model was assessed as remaining within ASL-2 under Anthropic's Responsible Scaling Policy.

Frontier Model Releases · Evaluation and Benchmarking · OpenAI o1-preview · Amazon Bedrock · Claude 3.5 Sonnet · +15 more

6 · Anthropic News · Jun 2, 2026

Anthropic signs MOU with UK Government to explore AI transformation of public services

Anthropic signed a Memorandum of Understanding with the UK's Department for Science, Innovation and Technology (DSIT) to explore deploying Claude in UK public services, including government information access and digital service delivery. The partnership will also cover AI supply chain security, R&D collaboration, and workforce adaptation, drawing on Anthropic's Economic Index for labor market insights. Anthropic will continue working with the UK AI Security Institute on capability evaluation and safety. The announcement cites several existing government deployments of Claude as illustrative context.

Enterprise Deployment Patterns · Regulatory Developments · Palantir · UK Department for Science, Innovation and Technology · Claude · +4 more

8 · Anthropic News · Jun 2, 2026

Introducing Claude 3.5 Sonnet

Anthropic launches Claude 3.5 Sonnet, the first model in its Claude 3.5 family, claiming it outperforms Claude 3 Opus and competitor models on GPQA, MMLU, and HumanEval benchmarks while operating at twice the speed and mid-tier pricing ($3/$15 per million tokens). The model features a 200K context window, improved vision capabilities, and an internal agentic coding evaluation score of 64% versus 38% for Opus. Alongside the model, Anthropic introduces Artifacts on Claude.ai, a dedicated workspace for real-time editing of AI-generated content. The model was evaluated before deployment by the UK AI Safety Institute and was assessed at ASL-2.

Long Context Evolution · Frontier Model Releases · claude.ai · Thorn · Amazon Bedrock · +16 more

7 · Anthropic News · Jun 2, 2026

Anthropic Details Collaboration with US CAISI and UK AISI on Constitutional Classifier Red-Teaming

Anthropic has published an account of its ongoing voluntary partnership with the US Center for AI Standards and Innovation (CAISI) and UK AI Security Institute (AISI), in which government red-teamers were given deep access to pre-deployment versions of Constitutional Classifiers used on Claude Opus 4 and 4.1. The collaboration uncovered multiple vulnerability classes including prompt injection bypasses, cipher-based obfuscation attacks, universal jailbreaks via automated attack refinement, and input/output fragmentation exploits, each of which drove architectural improvements to Anthropic's safeguard systems. Key lessons shared include the value of providing unprotected model variants, real-time classifier score access, and detailed internal documentation to enable targeted red-teaming. The announcement frames government partnership as a core component of Anthropic's Safeguards approach rather than a one-off audit.

Frontier Model Releases · Evaluation and Benchmarking · Constitutional Classifiers · prompt injection · Claude Opus 4.6 · +6 more

6 · Anthropic News · Jun 1, 2026

Anthropic Opens Tokyo Office, Signs AI Safety MoC with Japan AI Safety Institute

Anthropic has officially opened its first Asia-Pacific office in Tokyo, with CEO Dario Amodei meeting Japanese Prime Minister Takaichi and signing a Memorandum of Cooperation with the Japan AI Safety Institute to collaborate on AI evaluation methodologies. The company also joined the Hiroshima AI Process Friends Group and hosted a Builder Summit for 150+ startups. Japanese enterprise deployments of Claude are highlighted across Rakuten, Nomura Research Institute, Panasonic, and Classmethod, with Anthropic reporting 10x run-rate revenue growth in Asia-Pacific over the past year. Expansion to Seoul and Bengaluru is planned for coming months.

Evaluation and Benchmarking · AI Safety Research · Dario Amodei · Center for AI Standards and Innovation · Hidetoshi Tojo · +15 more

7 · Anthropic News · Jun 1, 2026

Anthropic Partners with UK Government to Deploy Claude-Powered AI Assistant on GOV.UK

Anthropic has been selected by the UK's Department for Science, Innovation and Technology (DSIT) to build and pilot an AI-powered assistant for GOV.UK, initially focused on helping job seekers navigate employment services and training resources. The system is described as agentic, maintaining context across interactions and routing users to appropriate services. The partnership builds on a February 2025 MOU and follows DSIT's 'Scan, Pilot, Scale' phased deployment framework, with Anthropic engineers embedded alongside civil servants at the Government Digital Service. A stated goal is building independent AI and AI safety expertise within the UK government.

AI Safety Research · Enterprise Deployment Patterns · UK Department for Science, Innovation and Technology · Claude · UK AI Security Institute · +8 more

6 · The Batch · May 23, 2026

Google Study Shows LLM-Generated Malware Is Getting Harder to Track and Stop

A Google security report catalogs emerging LLM-enabled cyberattack techniques including morphing malware with mutation engines, logical-flaw discovery in code, and AI-directed obfuscation networks. The report was prompted in part by a real incident where hackers used an LLM to find a zero-day in a widely used web administration tool. Separately, the UK AI Security Institute found that Claude Mythos Preview and GPT-5.5 can reliably execute attacks expected to take humans 3 hours, up from earlier 1-hour benchmarks, with performance scaling further when token limits are relaxed. The findings suggest an accelerating gap between LLM offensive capability and conventional defensive tooling.

Frontier Model Releases · Evaluation and Benchmarking · Claude Opus 4.6 · Google · UK AI Security Institute · +8 more

5 · OpenAI Blog · May 20, 2026

OpenAI Reports Progress with US CAISI and UK AISI on AI Safety and Security

OpenAI has published an update on its ongoing partnership with the US Center for AI Standards and Innovation (CAISI) and the UK AI Security Institute (AISI). The collaboration focuses on strengthening AI safety and security practices. The announcement signals continued institutional engagement between OpenAI and government AI safety bodies in both countries.

AI Safety Research · Regulatory Developments · UK AI Security Institute · US Cyber and AI Safety Institute · OpenAI

6 · Google DeepMind Blog · May 19, 2026

Google DeepMind Deepens Partnership with UK AI Security Institute

Google DeepMind and the UK AI Security Institute (AISI) are strengthening their collaboration on AI safety and security research. The announcement signals an expanded formal relationship between a leading frontier lab and a government-backed AI safety body. Specific research areas and deliverables were not detailed in the available text, but the partnership focuses on critical safety and security topics.

Evaluation and Benchmarking · AI Safety Research · UK AI Security Institute · Google DeepMind · +1 more

6 · The Batch · May 18, 2026

Anthropic Passes OpenAI in Business Adoption; Cerebras IPO; Claude Mythos Security Concerns

A Ramp AI Index survey shows Anthropic reached 34.4% business adoption in April 2026, surpassing OpenAI's 32.3%, though analysts cite token cost inflation, service degradation, and competition from cheaper inference platforms as threats to the lead. Cerebras surged 89% on its IPO debut, signaling investor appetite for AI infrastructure hardware. Separately, Anthropic's withheld Claude Mythos model, which solved a novel cybersecurity challenge, prompted meetings with the Financial Stability Board, while arXiv announced year-long bans for authors submitting unvetted AI-generated content.

Training Infrastructure · Frontier Model Releases · Financial Stability Board · Claude Mythos · UK AI Security Institute · +14 more