Entity · model

o4-mini-high

model · active · o4-mini-high-61ca8851 · 1 event · first seen Jun 10, 2026

Aliases: o4-mini-high

More like this (12)

o4-mini · o3-mini · o1-mini · o3 and o4-mini system card · Phi-4-mini · GPT-4o mini · OpenAI o3-mini · MiniF2F · o1 · o3 · gpt-audio-mini · ms-marco-MiniLM-L-6-v2

Recent events (1)

8 · arXiv · cs.AI · Jun 10, 2026

ABC-Bench: Agentic biosecurity benchmark finds LLM agents surpass median expert humans on dual-use biology tasks

Researchers introduce ABC-Bench, a benchmark evaluating LLM agents on biosecurity-relevant biology tasks including liquid-handling robot programming, DNA fragment design, and evasion of DNA synthesis screening. All tested agents outperformed the median expert human baseline across all three tasks. Wet-lab validation confirmed that OpenAI's o4-mini-high produced scripts that successfully assembled DNA on an OpenTrons robot. The results highlight a meaningful shift in the biosecurity risk landscape as AI agents acquire practical wet-lab-adjacent capabilities.

Frontier Model Releases · Evaluation and Benchmarking · ABC-Bench · OpenTrons · o4-mini-high · +2 more