Entity · benchmark

CursorBench

benchmark · active · cursorbench-a1272d07 · 3 events · first seen May 28, 2026

Aliases: CursorBench, CursorBench 3.2

More like this (12)

SelectBench · CursorBench v3.1 · TriggerBench · SorryBench · FoldBench · LiveBench · RepoBench · CharacterBench · EdgeBench · DeliveryBench · SkillsBench · ProgramBench

Recent events (3)

9 · Hacker News · 5d ago · source ↗

Anthropic releases Claude Opus 5

Anthropic has announced Claude Opus 5, a new flagship model. The item comes from Anthropic's official news domain, so it is a primary-source announcement. Opus 5 would succeed Claude Opus 4.8 as the flagship and is likely a major frontier model release.

Frontier Model Releases · Inference Economics · Zapier · Claude Max · Claude Opus 4.6 · +15 more

6 · The Batch · Jun 12, 2026 · source ↗

Cursor's Composer 2.5 rivals GPT-5.5 and Claude Opus 4.7 on coding benchmarks at lower cost

Cursor released Composer 2.5, a specialized agentic coding model built on Moonshot's Kimi K2.5 open weights with additional pretraining and reinforcement learning fine-tuning tailored to Cursor's own CLI harness. The model ranks third on the Artificial Analysis Coding Agent Index behind Claude Opus 4.7 and GPT-5.5 at max reasoning, but significantly undercuts them on cost ($0.44 vs $4.14 per task) and speed (6.7 vs 17.7 minutes). The training approach—co-optimizing model and harness together using synthetic tasks, text feedback during RL, and 25x more synthetic data than Composer 2—illustrates a specialist model strategy that challenges the dominance of generalist frontier models in coding workflows.

Frontier Model Releases · Inference Economics · SWE-Bench-Pro-Hard-AA · Claude Opus 4.6 · SpaceX · +12 more

8 · Hacker News · May 28, 2026 · source ↗

Claude Opus 4.8 Released by Anthropic

Anthropic has released Claude Opus 4.8, a new frontier model in its Claude lineup. The announcement appeared on Anthropic's official news page and drew over 1,000 points and 800+ comments on Hacker News. The source snippet alone gives no specific capability details or benchmark results.

Frontier Model Releases · Evaluation and Benchmarking · claude.ai · Claude Opus 4.6 · Databricks · +16 more