Almanac
Guide · Beginner

Codex: OpenAI's AI Coding Agent

TL;DR: Codex is OpenAI's AI-powered coding agent, a product that can write, fix, and ship software on your behalf, running tasks autonomously in the cloud while you stay in the loop. It started as a research model for code generation, evolved into a full agentic platform, and has grown into one of OpenAI's flagship enterprise products, now used by millions of developers and embedded across major cloud providers.

Key takeaways

  • Codex traces its roots to a 2021 research paper that introduced the HumanEval benchmark and the pass@k metric for measuring whether generated code is functionally correct. Both remain foundational tools across the field.
  • The modern Codex product launched in May 2025 and is powered by codex-1, a variant of o3 fine-tuned for software engineering via reinforcement learning on real-world coding tasks.
  • As of early 2026, Codex has 4 million weekly active users and is recognized as a Leader in Gartner's 2026 Magic Quadrant for Enterprise AI Coding Agents.
  • Codex is available on AWS, Oracle Cloud, Cloudflare Agent Cloud, and in hybrid/on-premise environments via a Dell partnership — plus natively in ChatGPT and on mobile.
  • OpenAI acquired Astral (makers of the Ruff linter and uv package manager) and announced plans to acquire Ona (persistent cloud environments) to deepen Codex's developer tooling stack.
  • Codex Security, a spin-off product in research preview, applies the same agentic approach to finding and patching application vulnerabilities.

What Codex is

Codex is OpenAI's AI coding agent — a product that can take a software task (write a feature, fix a bug, scan for security holes) and work through it autonomously, step by step, in a sandboxed cloud environment. It is not a code-completion tool that suggests the next line as you type; it is closer to a software engineer you can assign a whole ticket to and check back in with later.

The name "Codex" has a longer history than the current product. Back in 2021, OpenAI published a research paper introducing an early Codex model and the HumanEval benchmark, a test for measuring whether AI-generated code actually runs correctly. That paper became a landmark reference for the entire field of AI code generation. A follow-up technique called fill-in-the-middle (FIM), published in 2022, taught models to complete code given both what comes before and after a gap, which is the situation a code editor faces whenever the cursor sits between existing lines. Both ideas quietly shaped the tools developers use today.
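HumanEval results are usually reported as pass@k: the chance that at least one of k generated samples for a problem passes its unit tests. Below is a minimal sketch of the estimator described in the 2021 paper, assuming you have already generated n samples for a problem and counted the c that pass. The function name and the argument checks are illustrative choices, not OpenAI's reference code.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem.

    n: total samples generated for the problem
    c: number of those samples that passed the unit tests
    k: number of attempts the metric allows

    Computes 1 - C(n - c, k) / C(n, k): one minus the probability that
    a random draw of k samples contains no passing sample.
    """
    if n < 1 or not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("need n >= 1, 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples exist, so any draw of k includes a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


if __name__ == "__main__":
    # 10 samples, 3 of them correct, scored at k = 1 and k = 5.
    print(pass_at_k(10, 3, 1))
    print(pass_at_k(10, 3, 5))
```

With 10 samples of which 3 pass, pass@1 comes out at about 0.3, the plain pass rate. Raising k gives credit to a model that gets the answer right at least once in several tries, and a benchmark score is the average of this value over all problems.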

The modern product

The Codex you can use today launched in May 2025. Under the hood it runs on codex-1, a variant of OpenAI's o3 model that was fine-tuned specifically for software engineering using reinforcement learning on real-world coding tasks. The goal was a model that writes code the way a human engineer would: matching the style of the existing codebase, following instructions precisely, and iterating on tests until they pass.

Since launch, OpenAI has released a succession of more capable Codex-native models — GPT-5-Codex, GPT-5.1-Codex-Max, GPT-5.3-Codex — each designed to handle longer, more complex tasks with less hand-holding. The latest versions of the platform run on GPT-5.4 and GPT-5.5.

How it works (the basics)

Think of Codex as having three layers:

  1. The model: the AI brain that reads your codebase, reasons about the task, and writes the code.

  2. The agent loop: the machinery that lets the model use tools (run tests, read files, call APIs), check its own work, and keep going across many steps. OpenAI has published technical details on how this loop is built on top of its Responses API, using WebSockets and connection-scoped caching to keep things fast.

  3. The sandbox: a secure, isolated environment where the agent's code actually runs, with controlled file access and network restrictions so it can't do anything it shouldn't.
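The three layers described above can be pictured in a few lines of Python. This is a toy sketch under loud assumptions, not Codex's implementation: `Sandbox`, `scripted_model`, and `run_agent` are hypothetical names, the "model" is a fixed script rather than a real AI, and nothing here touches the real file system, the network, or the Responses API.

```python
from dataclasses import dataclass, field
from typing import Callable

# An action is a tool name plus its string arguments,
# e.g. ("read_file", ("app.py",)). Hypothetical format for this sketch.
Action = tuple[str, tuple[str, ...]]


@dataclass
class Sandbox:
    """Toy sandbox: an in-memory file store plus a network allowlist."""

    files: dict[str, str] = field(default_factory=dict)
    allowed_hosts: frozenset[str] = frozenset()

    def read_file(self, path: str) -> str:
        return self.files.get(path, "")

    def write_file(self, path: str, text: str) -> str:
        self.files[path] = text
        return f"wrote {path}"

    def fetch(self, host: str) -> str:
        # Network restriction: only allowlisted hosts are reachable.
        if host not in self.allowed_hosts:
            return f"blocked: {host} is not on the allowlist"
        return f"fetched {host}"  # placeholder, no real request is made


def scripted_model(history: list[str]) -> Action:
    """Stand-in for the model: picks the next action from the step count."""
    plan: list[Action] = [
        ("read_file", ("app.py",)),
        ("fetch", ("example.com",)),
        ("write_file", ("app.py", "print('fixed')")),
        ("done", ()),
    ]
    return plan[min(len(history), len(plan) - 1)]


def run_agent(
    model: Callable[[list[str]], Action],
    sandbox: Sandbox,
    max_steps: int = 10,
) -> list[str]:
    """Agent loop: ask the model for an action, run it, record the result."""
    tools: dict[str, Callable[..., str]] = {
        "read_file": sandbox.read_file,
        "write_file": sandbox.write_file,
        "fetch": sandbox.fetch,
    }
    history: list[str] = []
    for _ in range(max_steps):
        name, args = model(history)
        if name == "done":
            break
        tool = tools.get(name)
        if tool is None:
            history.append(f"unknown tool: {name}")
        else:
            history.append(tool(*args))
    return history


if __name__ == "__main__":
    box = Sandbox(files={"app.py": "print('bug')"})
    for observation in run_agent(scripted_model, box):
        print(observation)
```

Run against a sandbox holding one file and an empty allowlist, the loop reads the file, has its network request refused, and then writes the fix. The point of the design is that every tool call passes through the sandbox, so the limits hold no matter what the model decides to try.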

You interact with Codex through a desktop app (macOS and Windows), the ChatGPT web interface, or the ChatGPT mobile app — meaning you can kick off a task at your desk and approve the result from your phone.

What it can do now

Codex has grown well beyond pure coding. The desktop app now includes computer use (the agent can operate your machine), in-app browsing, image generation, persistent memory, and plugin support. OpenAI also released Symphony, an open-source specification for connecting issue trackers to always-on Codex agents, so engineering teams can route tickets directly to an AI that works through them in the background.

A separate product, Codex Security, applies the same agentic approach to application security — analyzing a project's context to detect, validate, and suggest patches for vulnerabilities, with the goal of fewer false alarms than traditional scanning tools.

For non-engineers, OpenAI has added plugins, sites, and annotation tools aimed at analysts, marketers, designers, and investors, positioning Codex as a general productivity agent rather than a purely developer-facing tool.

Who's using it and where

Codex has reached 4 million weekly active users and was named a Leader in Gartner's 2026 Magic Quadrant for Enterprise AI Coding Agents. OpenAI has launched Codex Labs and formed partnerships with major consulting firms — Accenture, PwC, and Infosys — to embed Codex across enterprise software development lifecycles.

On the infrastructure side, Codex is available through AWS, Oracle Cloud, and Cloudflare's Agent Cloud platform, and a partnership with Dell brings it to hybrid and on-premise environments for organizations that can't put sensitive code in a public cloud.

Real-world case studies include Wasmer (which used Codex with GPT-5.5 to build a Node.js edge runtime, reportedly 10–20x faster than with traditional development) and a self-improving tax agent built with Thrive and Crete.

The bigger picture: acquisitions and infrastructure

OpenAI is building out the full stack around Codex. It acquired Astral — the company behind the Ruff Python linter and the uv package manager — to accelerate Codex's Python developer tooling. It also announced plans to acquire Ona, which provides persistent cloud environments, to give Codex agents a stable home for long-running enterprise workflows.

These moves signal that OpenAI sees Codex not just as a model feature but as a platform — one it intends to own end-to-end, from the AI brain to the runtime environment to the developer tools that surround it.

Where it's heading

The trajectory is clear: Codex is moving from a tool that helps individual developers to infrastructure that runs software development workflows at organizational scale. The combination of more capable models, persistent cloud environments, enterprise cloud distribution, and an expanding plugin ecosystem points toward a future where Codex is less a product you open and more a system that runs in the background — handling tickets, catching bugs, and shipping code while your team focuses on the decisions that still need a human.

Codex: From Research Roots to Agentic Platform

Timeline

  1. Original Codex research paper published; HumanEval benchmark introduced

  2. Fill-in-the-middle (FIM) training technique published, later incorporated into Codex

  3. Modern Codex product launches; codex-1 model and system card released

  4. GPT-5-Codex released; Codex upgrades extend to terminal, IDE, web, and mobile

  5. GPT-5.1-Codex-Max released for agentic, project-scale coding

  6. GPT-5.3-Codex announced as a Codex-native agent with general reasoning

  7. OpenAI acquires Astral (Ruff, uv) to accelerate Codex Python tooling

  8. Codex Labs launched; 4M weekly active users reported; Accenture, PwC, Infosys partnerships announced

  9. OpenAI announces acquisition of Ona for persistent cloud agent environments

FAQ

What exactly does Codex do?

Codex is an AI agent that can write, debug, test, and ship code on your behalf. You give it a task — fix this bug, add this feature, scan this codebase for vulnerabilities — and it works through the steps autonomously in a sandboxed cloud environment, checking in with you when it needs approval.

Is Codex just a code-completion tool like GitHub Copilot?

No — it goes further. Traditional code completion suggests the next line as you type; Codex runs multi-step tasks independently, manages files, executes tests, and can work for extended periods without constant supervision. Think of it less like autocomplete and more like a junior developer you can assign a whole ticket to.

What model powers Codex?

Codex has used a series of purpose-built models: codex-1 (a fine-tuned variant of o3) at launch, followed by GPT-5-Codex, GPT-5.1-Codex-Max, GPT-5.3-Codex, and GPT-5.4/5.5 as the platform matured. Each generation added more reasoning depth and longer autonomous run times.

Where can I use Codex?

Codex is available through OpenAI's own apps (ChatGPT, desktop, mobile), and through major cloud providers including AWS, Oracle Cloud, and Cloudflare. Dell has also partnered with OpenAI to bring Codex to hybrid and on-premise enterprise environments.

Is Codex only for software engineers?

It started that way, but OpenAI has expanded Codex with plugins, sites, and annotation tools aimed at analysts, marketers, designers, and investors — positioning it as a broader productivity agent, not just a developer tool.

How does Codex handle safety and security?

Codex runs inside a sandboxed environment with controlled file access and network restrictions. OpenAI has published details on its internal security architecture — covering human approval workflows, network policies, and agent telemetry — and has launched a separate Codex Security product specifically for finding and patching application vulnerabilities.

More on Codex (6)

OpenAI Blog

Scaling Codex to enterprises worldwide

OpenAI is launching Codex Labs and forming partnerships with major consulting and IT firms including Accenture, PwC, and Infosys to accelerate enterprise adoption of Codex across the software development lifecycle. The announcement reports 4 million weekly active users for Codex. This represents a significant push to embed OpenAI's coding AI into large-scale enterprise workflows through established system integrators.

OpenAI Blog

Codex for (almost) everything: OpenAI expands Codex app with computer use, browsing, image generation, memory, and plugins

OpenAI has updated its Codex desktop application for macOS and Windows with a broad set of new capabilities including computer use, in-app browsing, image generation, persistent memory, and plugin support. The update positions Codex as a more comprehensive agentic developer tool rather than a pure code-completion assistant. These additions bring Codex closer to a general-purpose AI agent environment targeting developer workflows.

OpenAI Blog

Introducing upgrades to Codex

OpenAI has announced upgrades to Codex, its AI coding agent, improving speed, reliability, and real-time collaboration capabilities. The updates extend Codex's reach across multiple development environments including terminal, IDE, web, and mobile. The announcement emphasizes both interactive collaboration and autonomous task execution.

OpenAI Blog

Introducing Codex

OpenAI's announcement of Codex, its AI coding agent for software development tasks. The body of the post was not available when this summary was generated, so it does not cover capabilities, pricing, or availability.

OpenAI Blog

Evaluating Large Language Models Trained on Code

OpenAI published research on evaluating large language models trained on code, introducing the Codex model and the HumanEval benchmark for assessing code generation capabilities. The work established foundational methodology for measuring functional correctness of code produced by LLMs using a pass@k metric. This paper became a landmark reference for code-focused LLM evaluation and influenced subsequent code generation research across the field.

OpenAI Blog

OpenAI expands Codex with plugins, sites, and annotations for non-engineering roles

OpenAI announced new Codex capabilities including plugins, sites, and annotations targeting analysts, marketers, designers, investors, and other non-engineering teams. The expansion positions Codex as a broader productivity platform beyond software development. This represents a product surface expansion for OpenAI's coding-focused AI agent.