OpenAI o1-preview
openai-o1-preview-33aeaaea·2 events·first seen 28d agoAliases: OpenAI o1-preview
Co-occurring entities
More like this (12)
Recent events (2)
Introducing OpenAI o1
OpenAI announced o1, a new series of AI models designed to spend more time 'thinking' before responding, using chain-of-thought reasoning to tackle complex problems in science, coding, and mathematics. The o1-preview and o1-mini models are being released, with o1-preview representing the most capable version and o1-mini offering a faster, cheaper alternative optimized for coding and reasoning tasks. OpenAI claims o1-preview ranks in the 89th percentile on competitive programming problems and performs at a PhD level on science benchmarks. This release marks a significant shift in OpenAI's approach to scaling, moving from purely training-time compute to inference-time compute as a new axis of capability improvement.
Anthropic introduces computer use capability, upgraded Claude 3.5 Sonnet, and Claude 3.5 Haiku
Anthropic announced three major developments: an upgraded Claude 3.5 Sonnet with significant coding improvements (SWE-bench Verified rising from 33.4% to 49.0%, surpassing all publicly available models including reasoning models), a new Claude 3.5 Haiku that matches Claude 3 Opus performance at Haiku-tier speed, and a public beta of 'computer use' — a capability allowing Claude to control computers by viewing screens, moving cursors, clicking, and typing. Computer use is available via the Anthropic API, Amazon Bedrock, and Google Cloud Vertex AI, with early adopters including Replit, The Browser Company, and Cognition. Both safety institutes (US AISI and UK AISI) conducted pre-deployment testing, and the model was assessed as remaining within ASL-2 under Anthropic's Responsible Scaling Policy.