o1-preview

model · active · o1-preview-e6117946 · 2 events · first seen May 18, 2026

Aliases: o1-preview

Co-occurring entities

Kaggle · MLE-bench · OpenAI · DeepSeek V4 · AIME · MATH · DeepSeek-R1-Lite-Preview

More like this (12)

OpenAI o1-preview · o1 · o1-mini · o3 · o1 System Card · QwQ-Max-Preview · o3-mini · QwQ-32B-Preview · QVQ-72B-Preview · Mythos Preview · o4-mini · Qwen3.7-Plus-Preview

Recent events (2)

6 · OpenAI Blog · May 20, 2026

MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering

OpenAI introduces MLE-bench, a benchmark designed to measure AI agent performance on machine learning engineering tasks. The benchmark draws from Kaggle competitions to evaluate agents on realistic ML engineering workflows. Initial results show that current agents, including those powered by o1-preview, achieve competitive performance on a subset of tasks but fall well short of top human competitors. The benchmark is intended to track progress in agentic ML capabilities over time.

Frontier Model Releases · Evaluation and Benchmarking · Kaggle · o1-preview · MLE-bench (+2 more)

7 · DeepSeek News · May 18, 2026

DeepSeek-R1-Lite-Preview Launched with o1-Level Reasoning Performance

DeepSeek has released DeepSeek-R1-Lite-Preview, a reasoning-focused model claiming o1-preview-level performance on AIME and MATH benchmarks. The model features a transparent, real-time chain-of-thought process and demonstrates inference scaling behavior where longer reasoning chains yield better results. DeepSeek has indicated that open-source model weights and a full API are forthcoming. The model is currently accessible via chat.deepseek.com.

Frontier Model Releases · Evaluation and Benchmarking · o1-preview · DeepSeek V4 · AIME (+4 more)