6·Google DeepMind Blog·1mo ago

Protecting People from Harmful Manipulation

Google DeepMind has published research examining AI's potential for harmful manipulation across domains including finance and health. The work identifies manipulation risks and proposes new safety measures to address them. It is a proactive safety research effort from a Tier 1 lab, focused on misuse and adversarial deployment scenarios.

AI Safety Research·Alignment and RLHF·Google DeepMind

Related guides (3)

Google DeepMind

Google DeepMind: The Lab Behind Gemini, AlphaFold, and Frontier AI

AI Safety Research·Topic guide

AI Safety Research: From Lab Policies to Real-World Flashpoints

Alignment and RLHF·Topic guide

Alignment and RLHF: Teaching AI Models to Behave

Related events (8)

6·MIT Technology Review (AI)·9d ago

Google DeepMind funds research into risks of large-scale multi-agent interaction

Google DeepMind is funding research into the safety risks that emerge when millions of AI agents interact with each other online without human oversight. Rohin Shah, who directs AGI safety and alignment research at DeepMind, is cited as the source. The concern centers on emergent behaviors and coordination dynamics that could arise at mass-market agent deployment scale.

AI Safety Research·Agent and Tool Ecosystem·Rohin Shah·Google DeepMind·MIT Technology Review

5·OpenAI Blog·1mo ago

Preparing for malicious uses of AI

OpenAI co-authored a multi-institutional paper forecasting how malicious actors could misuse AI technology, produced in collaboration with FHI, CSER, CNAS, EFF, and others over nearly a year. The paper outlines potential threat vectors and proposes prevention and mitigation strategies. This represents an early coordinated effort among AI safety and policy organizations to systematically address AI misuse risks.

AI Safety Research·Regulatory Developments·Center for a New American Security·Centre for the Study of Existential Risk·Electronic Frontier Foundation·+3 more

6·Google DeepMind Blog·9d ago

Google DeepMind launches $10M funding call for multi-agent AI safety research

Google DeepMind and unnamed partners have announced a $10M funding call targeting safety research for multi-agent AI systems. The initiative signals institutional recognition that multi-agent architectures present distinct safety challenges requiring dedicated research investment. This is a notable funding commitment from a Tier 1 lab, directed specifically at an underexplored safety domain.

AI Safety Research·Agent and Tool Ecosystem·Google DeepMind

5·Google DeepMind Blog·1mo ago

Taking a Responsible Path to AGI

DeepMind published a blog post outlining its approach to AGI development, emphasizing technical safety, proactive risk assessment, and collaboration with the broader AI community. The post signals DeepMind's public positioning on responsible AGI development practices. It appears to be a high-level strategic communication rather than a technical disclosure or specific capability announcement.

Frontier Model Releases·AI Safety Research·AGI·Google DeepMind

6·Google DeepMind Blog·1mo ago

DeepMind Publishes Framework for Evaluating Cybersecurity Threats of Advanced AI

DeepMind has released a framework designed to help cybersecurity experts assess and prioritize defenses against potential threats posed by advanced AI systems. The framework aims to systematically identify which defensive measures are necessary given AI's expanding capabilities in offensive cyber operations. This represents DeepMind's structured approach to evaluating AI-enabled cyber risks before they materialize at scale.

Evaluation and Benchmarking·AI Safety Research·DeepMind AI Cybersecurity Threat Evaluation Framework

6·Google DeepMind Blog·2d ago

DeepMind publishes AI Control Roadmap for securing internal agentic systems

DeepMind released a blog post outlining an AI Control Roadmap aimed at securing internal systems that use AI agents. The approach combines traditional security safeguards with real-time monitoring. The announcement signals DeepMind's formal internal posture on agentic AI safety and control.

AI Safety Research·Agent and Tool Ecosystem·Google DeepMind·AI Control Roadmap

6·Google DeepMind Blog·1mo ago

Strengthening our Frontier Safety Framework

Google DeepMind has announced updates to its Frontier Safety Framework (FSF), aimed at better identifying and mitigating severe risks from advanced AI models. The announcement comes from a Tier 1 lab and signals continued evolution of internal safety governance structures. The post is brief and lacks technical specifics, but an update to a named safety framework from a major lab is still worth tracking.

Frontier Model Releases·AI Safety Research·Frontier Safety Framework·Google DeepMind

5·OpenAI Blog·1mo ago

Disrupting Malicious Uses of AI: OpenAI June 2025 Report

OpenAI published its June 2025 report on detecting and preventing malicious uses of its AI systems. The report features case studies of threat actors attempting to abuse OpenAI's models and the countermeasures deployed. This is part of OpenAI's ongoing transparency series on adversarial misuse.

AI Safety Research·Regulatory Developments·OpenAI