Entity · technique

adversarial robustness

technique · active · adversarial-robustness-ba26e331 · 5 events · first seen May 20, 2026

Aliases: adversarial robustness

Co-occurring entities

OpenAI · adversarial examples · Robust Classification · L-infinity perturbation · adversarial training · L2 perturbation · UAR (Unforeseen Attack Robustness) · inference-time compute scaling

More like this (12)

Latent Adversarial Robustification (LAR) · adversarial refinement · UAR (Unforeseen Attack Robustness) · adversarial examples · Robust Classification · Evaluation of Adversarial Robustness in Arabic Language Models · adversarial training · distributionally robust optimization · Exploring Adversarial Robustness and Safety Alignment in Multilingual Multi-Modal Large Language Models · Adversarial Attacks on Neural Network Policies · adversarial pragmatics · black-box adversarial attacks

Recent events (5)

3 · OpenAI Blog · May 20, 2026

Attacking Machine Learning with Adversarial Examples

This 2017 OpenAI blog post introduces adversarial examples: inputs intentionally crafted to cause machine learning models to make mistakes, which the post likens to optical illusions for machines. It surveys how adversarial examples appear across different input modalities and discusses why defending against them is fundamentally difficult. The post is an early, foundational explainer on adversarial robustness from OpenAI.

AI Safety Research · adversarial examples · adversarial robustness · OpenAI
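To make the idea of an adversarial example concrete, here is a minimal sketch using a one-step sign-gradient perturbation (commonly called the fast gradient sign method, a standard technique that the summary above does not name). The toy linear classifier, its weights `w`, the input `x`, and the budget `eps` are all hypothetical values chosen for illustration, not taken from the post.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, b, x):
    """Probability of class 1 under a toy logistic-regression model."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def sign(g):
    return (g > 0) - (g < 0)

def fgsm(w, b, x, y, eps):
    """Move each coordinate of x by eps in the direction that increases the loss."""
    p = predict_proba(w, b, x)
    # For binary cross-entropy, d(loss)/d(x_i) = (p - y) * w_i.
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]

# Hypothetical model and input (illustrative values only).
w, b = [2.0, -1.0, 0.5], 0.0
x, y = [0.3, 0.1, 0.2], 1

x_adv = fgsm(w, b, x, y, eps=0.3)
clean_p = predict_proba(w, b, x)      # above 0.5: classified as class 1
adv_p = predict_proba(w, b, x_adv)    # below 0.5: the prediction flips
```

Each coordinate changes by at most `eps`, yet the predicted class changes, which is the behavior the post describes.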

4 · OpenAI Blog · May 20, 2026

Computational limitations in robust classification and win-win results

OpenAI published research on computational limitations in robust classification, exploring theoretical limits on learning adversarially robust classifiers. The 'win-win' results describe a dichotomy: either a robust classifier can be learned efficiently, or the hardness of learning one can be used to construct cryptographic primitives. This 2019 work is a foundational safety and robustness contribution addressing hardness results in adversarial ML.

Evaluation and Benchmarking · AI Safety Research · adversarial robustness · Robust Classification · OpenAI

4 · OpenAI Blog · May 20, 2026

Transfer of Adversarial Robustness Between Perturbation Types

OpenAI published research examining whether adversarial robustness obtained by training against one perturbation type (e.g., L-infinity) transfers to other perturbation types (e.g., L2, L1). The work investigates how well adversarial training generalizes across different threat models. This is an early safety and robustness research contribution from OpenAI, predating the modern LLM era.

Evaluation and Benchmarking · AI Safety Research · adversarial robustness · L-infinity perturbation · adversarial training · +2 more
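The perturbation types named above differ in how they measure the size of a change to the input. The summary does not define them, so the sketch below assumes the standard vector norms; the two example perturbations and their dimensions are hypothetical. It shows why robustness under one budget need not imply robustness under another: a dense, small change and a sparse, large change can have the same L2 size while differing by a factor of ten in L-infinity and L1.

```python
import math

def linf_norm(delta):
    """Largest absolute change to any single coordinate."""
    return max(abs(v) for v in delta)

def l2_norm(delta):
    """Euclidean length of the perturbation."""
    return math.sqrt(sum(v * v for v in delta))

def l1_norm(delta):
    """Total absolute change summed over all coordinates."""
    return sum(abs(v) for v in delta)

def within_budget(delta, norm, eps):
    """True if the perturbation is allowed under the given threat model."""
    return norm(delta) <= eps

# Hypothetical 100-dimensional perturbations (illustrative values only).
dense = [0.1] * 100            # every coordinate nudged slightly
sparse = [1.0] + [0.0] * 99    # one coordinate changed a lot

dense_sizes = (linf_norm(dense), l2_norm(dense), l1_norm(dense))
sparse_sizes = (linf_norm(sparse), l2_norm(sparse), l1_norm(sparse))
```

Here `dense` fits inside an L-infinity budget of 0.1 but needs an L1 budget of about 10, while `sparse` fits inside an L1 budget of 1 but needs an L-infinity budget of 1.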

4 · OpenAI Blog · May 20, 2026

Testing Robustness Against Unforeseen Adversaries

OpenAI published a method to evaluate whether neural network classifiers can defend against adversarial attacks not encountered during training. The approach introduces a new metric called UAR (Unforeseen Attack Robustness) to quantify a model's resilience to unanticipated attacks. The work argues for measuring robustness across a broader, more diverse set of attack types rather than only those seen in training.

Evaluation and Benchmarking · AI Safety Research · adversarial robustness · OpenAI · UAR (Unforeseen Attack Robustness)

6 · OpenAI Blog · May 20, 2026

Trading Inference-Time Compute for Adversarial Robustness

OpenAI published research exploring the trade-off between inference-time compute and adversarial robustness. The work investigates whether allocating more compute at inference time can improve a model's resistance to adversarial attacks. This connects to the broader trend of using test-time compute scaling as a lever for capability and safety improvements.

Evaluation and Benchmarking · Inference Economics · adversarial robustness · inference-time compute scaling · OpenAI · +1 more