
Devin
devin-63d8e29c·7 events·first seen 1mo ago
Aliases: Devin
Recent events (7)
The Age of Async Agents — Cognition's Walden Yan & OpenInspect's Cole Murray
A Latent Space podcast episode featuring Cognition's Walden Yan and OpenInspect's Cole Murray discussing the current state of autonomous software engineering agents. Topics include Devin's reported 80% commit rate, spec-to-PR workflows, full VM environments for agents, agent memory, and the emerging pattern of product managers shipping code directly. The conversation covers practical deployment patterns and tooling for async agentic coding workflows.
Cognition raises $1B in $26B Series D
Cognition, the AI coding agent company behind Devin, has raised $1B in a Series D round at a $26B valuation. The round signals continued investor conviction in autonomous coding agents as a large and growing market. The Latent Space newsletter frames coding as an 'uncapped TAM market,' reflecting broader industry sentiment around AI-driven software development.
Empirical study finds 80% of AI agent-authored test patches lack meaningful verification logic
A large-scale empirical study of 86,156 test-file patches from 33,596 agent-authored GitHub PRs finds that 80.2% contain weak or no explicit oracle signals, meaning they execute code without verifying its behavior. The study covers five coding agents (OpenAI Codex, GitHub Copilot, Devin, Cursor, and Claude Code) across 2,807 repositories and introduces a syntactic taxonomy of eight oracle signal categories. Although patches with strong oracles have lower raw merge rates, regression analysis shows that strong oracles significantly improve merge likelihood (OR=1.28), suggesting that current quality gates based on test-file presence substantially overestimate verification strength.
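To illustrate the distinction the study draws between executing code and verifying it, here is a minimal sketch in Python. The `slugify` function and both test names are hypothetical and chosen only for illustration. They are not taken from the study, and the study's eight-category taxonomy is not reproduced here.

```python
def slugify(title: str) -> str:
    """Hypothetical function under test: lowercase the title and join words with hyphens."""
    return "-".join(title.lower().split())


def test_weak_oracle() -> None:
    # Weak oracle: the code runs, but nothing is asserted.
    # This test passes for any return value and fails only if an exception is raised.
    slugify("Hello World")


def test_strong_oracle() -> None:
    # Strong oracle: the observable result is compared against an expected value,
    # so a behavioral regression in slugify would make this test fail.
    assert slugify("Hello World") == "hello-world"
    assert slugify("  Spaced   Out ") == "spaced-out"


if __name__ == "__main__":
    test_weak_oracle()
    test_strong_oracle()
```

A quality gate that only checks whether a PR touches a test file would treat both tests as equivalent, although only the second one constrains behavior.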
Anthropic Releases Claude Sonnet 4.5: Top Coding and Computer-Use Model with Agent SDK
Anthropic has released Claude Sonnet 4.5, claiming it is the best coding model and strongest model for building complex agents, with a 61.4% score on OSWorld (up from 42.2% for Sonnet 4) and state-of-the-art performance on SWE-bench Verified. The release is accompanied by major product upgrades including checkpoints in Claude Code, a native VS Code extension, a Claude Agent SDK giving developers access to the same infrastructure powering Claude Code, and new context editing and memory tools in the Claude API. Pricing is unchanged from Sonnet 4 at $3/$15 per million input/output tokens. Early enterprise customers including Cursor, GitHub Copilot, Devin, Canva, and Figma report significant gains in coding, agentic, and long-context tasks.
Anthropic Releases Claude Opus 4.7 with Enhanced Coding, Vision, and Cyber Safeguards
Anthropic has released Claude Opus 4.7, a general-availability model positioned as a meaningful improvement over Opus 4.6 in advanced software engineering, long-horizon agentic tasks, and vision capabilities including higher image resolution. The model is notably the first to receive new cybersecurity safeguards developed in response to Project Glasswing, with automatic detection and blocking of prohibited cyber uses and a new Cyber Verification Program for legitimate security professionals. Opus 4.7 is available across Claude products, API, Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry at the same pricing as Opus 4.6 ($5/$25 per million input/output tokens). The release is explicitly positioned below Claude Mythos Preview in overall capability, serving as a testbed for safety mechanisms before broader deployment of Mythos-class models.
Coding with OpenAI o1
OpenAI published a brief feature in which Scott Wu, CEO of Cognition (maker of the Devin AI software engineer), describes how o1 approaches coding decisions in a more human-like, reasoning-oriented manner. The piece is a short promotional commentary tied to the o1 model launch, highlighting o1's potential impact on AI-assisted software development. No new technical benchmarks or capability details are disclosed.
Claude Opus 4.8 Released by Anthropic
Anthropic has released Claude Opus 4.8, a new frontier model in their Claude lineup. The announcement appeared on Anthropic's official news page and generated significant community engagement on Hacker News with over 1,000 points and 800+ comments. Specific capability details and benchmarks are not available from the source snippet alone.