UK AI Security Institute

organization · active · uk-ai-security-institute-f605d5ee · 14 events · first seen 29d ago

Aliases: UK AI Safety Institute, UK AI Security Institute, AI Security Institute (AISI)

Merged from

UK AI Safety Institute

Recent events (14)

6 · Google DeepMind Blog · 28d ago

Google DeepMind Deepens Partnership with UK AI Security Institute

Google DeepMind and the UK AI Security Institute (AISI) are strengthening their collaboration on AI safety and security research. The announcement signals an expanded formal relationship between a leading frontier lab and a government-backed AI safety body. The available text did not detail specific research areas or deliverables.

7 · Anthropic News · 15d ago

Anthropic Details Collaboration with US CAISI and UK AISI on Constitutional Classifier Red-Teaming

Anthropic has published an account of its ongoing voluntary partnership with the US Center for AI Standards and Innovation (CAISI) and UK AI Security Institute (AISI), in which government red-teamers were given deep access to pre-deployment versions of Constitutional Classifiers used on Claude Opus 4 and 4.1. The collaboration uncovered multiple vulnerability classes including prompt injection bypasses, cipher-based obfuscation attacks, universal jailbreaks via automated attack refinement, and input/output fragmentation exploits, each of which drove architectural improvements to Anthropic's safeguard systems. Key lessons shared include the value of providing unprotected model variants, real-time classifier score access, and detailed internal documentation to enable targeted red-teaming. The announcement frames government partnership as a core component of Anthropic's Safeguards approach rather than a one-off audit.

6 · The Batch · 24d ago

Google Study Shows LLM-Generated Malware Is Getting Harder to Track and Stop

A Google security report catalogs emerging LLM-enabled cyberattack techniques including morphing malware with mutation engines, logical-flaw discovery in code, and AI-directed obfuscation networks. The report was prompted in part by a real incident where hackers used an LLM to find a zero-day in a widely used web administration tool. Separately, the UK AI Security Institute found that Claude Mythos Preview and GPT-5.5 can reliably execute attacks expected to take humans 3 hours, up from earlier 1-hour benchmarks, with performance scaling further when token limits are relaxed. The findings suggest an accelerating gap between LLM offensive capability and conventional defensive tooling.

6 · Anthropic News · 13d ago

Anthropic policy recap: US Executive Order, G7 Code of Conduct, and Bletchley Park AI Safety Summit

Anthropic published a policy commentary summarizing three major AI governance events from late October/early November 2023: the US Executive Order on AI, the G7 International Code of Conduct for advanced AI developers, and the UK-hosted Bletchley Park AI Safety Summit. The post covers Anthropic's positions on each, including support for NIST capacity-building, the G7 Code of Conduct, and the newly announced UK and US AI Safety Institutes. Dario Amodei presented Anthropic's Responsible Scaling Policy at Bletchley as a potential regulatory prototype, and the 28-country Bletchley Declaration notably included China among its signatories.

5 · OpenAI Blog · 27d ago

OpenAI Reports Progress with US CAISI and UK AISI on AI Safety and Security

OpenAI has published an update on its ongoing partnership with the US Center for AI Standards and Innovation (CAISI) and the UK AI Security Institute (AISI). The collaboration focuses on strengthening AI safety and security practices. The announcement signals continued institutional engagement between OpenAI and government AI safety bodies in both countries.

6 · The Batch · 29d ago

Anthropic Passes OpenAI in Business Adoption; Cerebras IPO; Claude Mythos Security Concerns

A Ramp AI Index survey shows Anthropic reached 34.4% business adoption in April 2026, surpassing OpenAI's 32.3%, though analysts cite token cost inflation, service degradation, and competition from cheaper inference platforms as threats to the lead. Cerebras surged 89% on its IPO debut, signaling investor appetite for AI infrastructure hardware. Separately, Anthropic's withheld Claude Mythos model—which solved a novel cybersecurity challenge—prompted meetings with the Financial Stability Board, while ArXiv announced year-long bans for authors submitting unvetted AI-generated content.

8 · Anthropic News · 14d ago

Introducing Claude 3.5 Sonnet

Anthropic launches Claude 3.5 Sonnet, the first model in its Claude 3.5 family, claiming it outperforms Claude 3 Opus and competitor models on GPQA, MMLU, and HumanEval benchmarks while operating at twice the speed of Claude 3 Opus at mid-tier pricing ($3 per million input tokens, $15 per million output tokens). The model features a 200K context window, improved vision capabilities, and an internal agentic coding evaluation score of 64% versus 38% for Opus. Alongside the model, Anthropic introduces Artifacts on Claude.ai, a dedicated workspace for real-time editing of AI-generated content. The UK AI Safety Institute evaluated the model before deployment, and Anthropic assessed it at ASL-2.

7 · Anthropic News · 15d ago

Anthropic Partners with UK Government to Deploy Claude-Powered AI Assistant on GOV.UK

Anthropic has been selected by the UK's Department for Science, Innovation and Technology (DSIT) to build and pilot an AI-powered assistant for GOV.UK, initially focused on helping job seekers navigate employment services and training resources. The system is described as agentic, maintaining context across interactions and routing users to appropriate services. The partnership builds on a February 2025 MOU and follows DSIT's 'Scan, Pilot, Scale' phased deployment framework, with Anthropic engineers embedded alongside civil servants at the Government Digital Service. A stated goal is building independent AI and AI safety expertise within the UK government.

6 · Anthropic News · 14d ago

Anthropic signs MOU with UK Government to explore AI transformation of public services

Anthropic signed a Memorandum of Understanding with the UK's Department for Science, Innovation and Technology (DSIT) to explore deploying Claude in UK public services, including government information access and digital service delivery. The partnership will also cover AI supply chain security, R&D collaboration, and workforce adaptation, drawing on Anthropic's Economic Index for labor market insights. Anthropic will continue working with the UK AI Security Institute on capability evaluation and safety. The announcement includes several existing government deployments of Claude as illustrative context.

9 · Anthropic News · 14d ago

Anthropic introduces computer use capability, upgraded Claude 3.5 Sonnet, and Claude 3.5 Haiku

Anthropic announced three major developments: an upgraded Claude 3.5 Sonnet with significant coding improvements (SWE-bench Verified rising from 33.4% to 49.0%, surpassing all publicly available models including reasoning models), a new Claude 3.5 Haiku that matches Claude 3 Opus performance at Haiku-tier speed, and a public beta of 'computer use' — a capability allowing Claude to control computers by viewing screens, moving cursors, clicking, and typing. Computer use is available via the Anthropic API, Amazon Bedrock, and Google Cloud Vertex AI, with early adopters including Replit, The Browser Company, and Cognition. Both safety institutes (US AISI and UK AISI) conducted pre-deployment testing, and the model was assessed as remaining within ASL-2 under Anthropic's Responsible Scaling Policy.

6 · Anthropic News · 15d ago

Anthropic Opens Tokyo Office, Signs AI Safety MoC with Japan AI Safety Institute

Anthropic has officially opened its first Asia-Pacific office in Tokyo, with CEO Dario Amodei meeting Japanese Prime Minister Takaichi and signing a Memorandum of Cooperation with the Japan AI Safety Institute to collaborate on AI evaluation methodologies. The company also joined the Hiroshima AI Process Friends Group and hosted a Builder Summit for 150+ startups. Japanese enterprise deployments of Claude are highlighted across Rakuten, Nomura Research Institute, Panasonic, and Classmethod, with Anthropic reporting 10x run-rate revenue growth in Asia-Pacific over the past year. Expansion to Seoul and Bengaluru is planned for coming months.

6 · Anthropic News · 12d ago

Anthropic publishes policy brief calling for targeted AI regulation within 18 months

Anthropic published a policy position paper arguing that governments have an 18-month window to enact narrowly-targeted AI regulation before risks in cyber and CBRN domains become acute. The post cites rapid capability gains—SWE-bench scores rising from 1.96% to 49% in a year, GPQA scores approaching human expert level—as evidence that frontier models are approaching meaningful misuse thresholds. Anthropic also reviews its Responsible Scaling Policy as a model for adaptive, proportionate risk governance and calls for similar frameworks to be adopted industry-wide and codified in law.

9 · Anthropic News · 4d ago

US government orders Anthropic to suspend access to Fable 5 and Mythos 5 citing national security jailbreak concerns

The US government issued an export control directive requiring Anthropic to immediately disable Fable 5 and Mythos 5 for all foreign nationals, effectively forcing a full customer suspension to ensure compliance. The government cited awareness of a jailbreak method, but Anthropic disputes the severity, stating the demonstrated technique is a narrow, non-universal jailbreak that produces results already achievable by other publicly available models including GPT-5.5. Anthropic is complying with the directive while publicly disagreeing with the standard applied, arguing that requiring perfect jailbreak resistance would halt all frontier model deployments industry-wide. This is a significant regulatory and safety governance flashpoint involving government authority over commercial AI model access.

7 · Anthropic News · 12d ago

Anthropic makes Claude 3 Haiku and Sonnet available to US Intelligence Community and AWS GovCloud

Anthropic has made Claude 3 Haiku and Claude 3 Sonnet available via AWS Marketplace for the US Intelligence Community and AWS GovCloud, marking a significant expansion into government deployment. The company has crafted contractual exceptions to its general Usage Policy to permit legally authorized foreign intelligence analysis, including combating human trafficking and identifying covert influence campaigns, while maintaining restrictions on disinformation, weapons design, and malicious cyber operations. The deployment is currently limited to ASL-2 models under Anthropic's Responsible Scaling Policy. Anthropic also notes that it gave the UK AI Safety Institute pre-release access to Claude 3.5 Sonnet for pre-deployment testing.