Almanac
Guide · In-depth

OpenAI: From Research Lab to Frontier AI Infrastructure Company

OpenAI · In-depth · active · v1 · live · generated 6d ago

TL;DR: OpenAI began as a nonprofit AI safety research lab and has evolved into the most capitalised AI company in history, shipping the models and products that defined the modern large language model era. Its trajectory, from GPT-1 through ChatGPT to the GPT-5 family, traces the arc of the entire field, and its recent moves into open weights, defence contracts, and multi-cloud infrastructure signal a company repositioning itself as foundational AI infrastructure rather than a product studio.

Key takeaways

  • GPT-3 (175B parameters, 2020) and the 2020 scaling laws paper established the empirical foundations that the entire industry now builds on.
  • ChatGPT's November 2022 launch is the most significant public AI adoption event in the events bundle: it made LLMs mainstream.
  • OpenAI raised $110B at a $730B valuation (Feb 2026) and followed with a $122B round (Mar 2026) — among the largest private capital raises in history.
  • The GPT-5 family introduced a unified routing architecture (gpt-5-main, gpt-5-thinking, lightweight variants) and the o-series introduced inference-time compute scaling via chain-of-thought reinforcement learning.
  • OpenAI released gpt-oss-120b and gpt-oss-20b under Apache 2.0 (Aug 2025), marking a strategic entry into open weights after years of proprietary-only releases.
  • OpenAI signed a Department of War contract for classified AI deployments and secured multi-cloud deals with Amazon and Microsoft totalling hundreds of billions in compute commitments.

What OpenAI is

OpenAI is an AI research and deployment company responsible for the GPT model family, ChatGPT, the o-series reasoning models, Sora (video generation), and a growing suite of enterprise and developer APIs. It is the organisation most directly associated with the modern large language model era: its research established the pre-training paradigm, its scaling laws gave the field a predictive framework for resource allocation, and its products brought LLMs to a mass audience.

Research foundations (2018–2021)

OpenAI's technical lineage begins with GPT-1 (2018), which demonstrated that unsupervised pre-training on large text corpora followed by task-specific fine-tuning could achieve state-of-the-art results across diverse NLP tasks — a paradigm that now underlies virtually every frontier model. The 2020 scaling laws paper formalised the power-law relationships between compute, data, parameters, and loss, giving practitioners a principled basis for training budget decisions. GPT-3 (175B parameters, 2020) operationalised few-shot learning at scale, showing that sufficiently large models could perform well on novel tasks with only a handful of examples and no gradient updates. CLIP (2021) extended the zero-shot transfer idea to vision, enabling natural-language-supervised image classification.
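The power-law relationship between model size and loss can be sketched numerically. The snippet below is a minimal illustration, assuming the single-variable form L(N) = (N_c / N)^alpha_N for parameter count. The constants are approximate values commonly quoted from the 2020 paper and are treated here as assumptions, not authoritative figures; the function names are hypothetical.

```python
# Illustrative sketch of a power-law scaling curve: loss falls as a power of model size.
# ASSUMPTION: functional form L(N) = (N_c / N) ** alpha_N with approximate constants.

ALPHA_N = 0.076   # assumed exponent for parameter count
N_C = 8.8e13      # assumed critical scale, in parameters


def loss_from_params(n_params: float, n_c: float = N_C, alpha: float = ALPHA_N) -> float:
    """Predicted cross-entropy loss for a model with n_params parameters."""
    if n_params <= 0:
        raise ValueError("n_params must be positive")
    return (n_c / n_params) ** alpha


def loss_ratio_for_scale_up(factor: float, alpha: float = ALPHA_N) -> float:
    """Multiplicative change in predicted loss when parameters grow by `factor`."""
    if factor <= 0:
        raise ValueError("factor must be positive")
    return factor ** (-alpha)


for n in (1.5e9, 175e9):
    print(f"{n:.3g} params -> predicted loss {loss_from_params(n):.3f}")
print(f"10x parameters multiplies predicted loss by {loss_ratio_for_scale_up(10):.3f}")
```

Under these assumed constants, a tenfold increase in parameters lowers the predicted loss by roughly 16%. That kind of estimate is what makes the framework useful for budgeting a training run before committing compute.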

These papers were not just OpenAI milestones — they became the field's shared infrastructure, cited and built upon by every major lab that followed.

The ChatGPT inflection (2022–2023)

The November 2022 launch of ChatGPT was qualitatively different from prior model releases: it was a consumer product designed for dialogue, capable of acknowledging errors, challenging incorrect premises, and declining inappropriate requests. The conversational interface collapsed the distance between frontier model capability and general-user accessibility. GPT-4 followed in March 2023, adding multimodal inputs (image + text) and human-level performance on professional and academic benchmarks.

The same period produced OpenAI's most significant governance crisis: in November 2023, the board abruptly removed CEO Sam Altman, triggering a multi-day standoff that ended with his reinstatement and a restructured board. The episode exposed the structural tensions between OpenAI's nonprofit origins and its commercial trajectory.

Architectural pivots: multimodality and inference-time scaling (2024)

Two architectural bets defined 2024. GPT-4o (May 2024) introduced a natively omnimodal architecture processing text, audio, and vision in a unified model without separate pipeline stages. Sora (February 2024) framed video generation as a path toward general-purpose physical world simulation, operating on spacetime patches of video and image latent codes via a transformer architecture.
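To make the "spacetime patches" idea concrete, here is a toy sketch that counts how many transformer tokens result from cutting a latent video into non-overlapping patches. The latent dimensions and patch sizes are hypothetical, chosen only for illustration, and the `LatentVideo` type and function name are not from any OpenAI API. Treating a still image as a one-frame video is also an assumption of this sketch.

```python
# Toy sketch: counting the tokens produced by cutting a latent video into spacetime patches.
# ASSUMPTION: all dimensions and patch sizes below are hypothetical illustration values.
from dataclasses import dataclass


@dataclass(frozen=True)
class LatentVideo:
    frames: int   # temporal length of the latent representation
    height: int
    width: int


def spacetime_patch_count(video: LatentVideo, pt: int, ph: int, pw: int) -> int:
    """Number of non-overlapping (pt x ph x pw) patches; each patch becomes one token."""
    for size, patch in ((video.frames, pt), (video.height, ph), (video.width, pw)):
        if patch <= 0 or size % patch != 0:
            raise ValueError("each dimension must be a positive multiple of its patch size")
    return (video.frames // pt) * (video.height // ph) * (video.width // pw)


clip = LatentVideo(frames=16, height=32, width=32)
still = LatentVideo(frames=1, height=32, width=32)   # an image handled as a one-frame video

print(spacetime_patch_count(clip, pt=2, ph=4, pw=4))    # 8 * 8 * 8 = 512
print(spacetime_patch_count(still, pt=1, ph=4, pw=4))   # 1 * 8 * 8 = 64
```

Because videos and images reduce to the same kind of token sequence, a single transformer can in principle train on both, with sequence length varying by duration and resolution.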

The o1 release (September 2024) represented a more fundamental shift: rather than relying only on further training-time scale, OpenAI introduced inference-time compute scaling, training models via chain-of-thought reinforcement learning to spend more time "thinking" before responding. OpenAI reported that o1 ranked in the 89th percentile on competitive programming questions (Codeforces) and exceeded PhD-level accuracy on a science benchmark (GPQA); o1-preview was the early version made available at launch. o3 and o4-mini (April 2025) extended this line with full tool access, integrating reasoning with agentic capabilities.
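The general intuition behind inference-time scaling, that more compute per query can buy accuracy, can be illustrated with a much simpler stand-in technique. The sketch below models majority voting over independent samples (self-consistency-style voting), not the chain-of-thought reinforcement learning approach itself. It assumes a binary task and independent samples, each correct with a hypothetical probability.

```python
# Toy model of spending more compute at inference: sample n answers, take a majority vote.
# ASSUMPTION: this is self-consistency-style voting, used only as a stand-in;
# it is not the chain-of-thought RL method used for the o-series.
from math import comb


def majority_vote_accuracy(p_correct: float, n_samples: int) -> float:
    """Probability that a strict majority of n independent samples is correct (binary task)."""
    if not 0.0 <= p_correct <= 1.0:
        raise ValueError("p_correct must be in [0, 1]")
    if n_samples < 1 or n_samples % 2 == 0:
        raise ValueError("n_samples must be a positive odd integer")
    threshold = n_samples // 2 + 1
    return sum(
        comb(n_samples, k) * p_correct**k * (1 - p_correct) ** (n_samples - k)
        for k in range(threshold, n_samples + 1)
    )


# Hypothetical per-sample accuracy of 60%.
for n in (1, 3, 9, 27):
    print(n, round(majority_vote_accuracy(0.6, n), 3))
```

With a per-sample accuracy above 50%, accuracy rises monotonically as the sample count grows. That is the same qualitative trade the o-series makes: more inference compute for better answers.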

The GPT-5 era and open weights (2025–2026)

GPT-5 (August 2025) introduced a unified routing architecture that dynamically selects among sub-models — gpt-5-main, gpt-5-thinking, and lightweight variants including gpt-5-thinking-nano — balancing speed and capability by task. The system card published alongside the launch provided the first official safety and capability disclosure for the family.
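A routing layer of this kind can be sketched as a function from request features to a sub-model name. The example below is hypothetical: only the model names come from the description above, while the `Request` fields and the decision rules are invented for illustration and do not describe OpenAI's actual router.

```python
# Hypothetical routing sketch: pick a sub-model from simple request features.
# ASSUMPTION: the features and rules are invented; only the model names come from the text.
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    prompt: str
    needs_reasoning: bool = False     # e.g. set by a classifier or an explicit user toggle
    latency_sensitive: bool = False   # e.g. set for interactive or high-volume traffic


def route(request: Request) -> str:
    """Return the name of the sub-model that should handle the request."""
    if request.needs_reasoning and request.latency_sensitive:
        return "gpt-5-thinking-nano"   # lightweight reasoning variant
    if request.needs_reasoning:
        return "gpt-5-thinking"        # deeper reasoning, slower responses
    return "gpt-5-main"                # fast default path


print(route(Request("Summarise this email")))
print(route(Request("Prove this lemma", needs_reasoning=True)))
print(route(Request("Check this step", needs_reasoning=True, latency_sensitive=True)))
```

A production router would presumably learn these decisions from signals such as conversation type and task complexity rather than hard-coding them; the fixed rules here only show the shape of the interface.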

Immediately before GPT-5, OpenAI made a strategic reversal: it released gpt-oss-120b and gpt-oss-20b under the Apache 2.0 license, optimised for consumer hardware and claiming to outperform similarly sized open models on reasoning and tool use. These were OpenAI's first open-weight language models since GPT-2 (2019), and the release signalled competitive pressure from the open-source ecosystem (notably DeepSeek-R1, which had claimed parity with o1 at a fraction of the API cost).

The GPT-5.x iteration cycle accelerated through late 2025 and into 2026: GPT-5.2 (December 2025) targeted professional reasoning and agentic workflows; GPT-5.4 (March 2026) added a 1.05M-token context window, native computer use, and tool search, with Pro-tier pricing at $30/$180 per million input/output tokens; GPT-5.4 mini and nano extended the family for efficiency-sensitive deployments; GPT-5.5 (April 2026) pushed further on speed and reasoning. GPT-Rosalind (April 2026) marked OpenAI's first domain-specialised frontier model, targeting drug discovery, genomics, and protein reasoning.
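Per-million-token pricing is easier to reason about with a worked calculation. The sketch below uses the $30/$180 Pro-tier figures quoted above; the token counts are made-up example values and the function name is hypothetical.

```python
# Cost arithmetic for per-million-token pricing.
# The $30 / $180 defaults are the Pro-tier prices quoted above; token counts are made up.

def request_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price_per_m: float = 30.0,
    output_price_per_m: float = 180.0,
) -> float:
    """Dollar cost of one request given token counts and per-million-token prices."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * input_price_per_m + output_tokens * output_price_per_m
    return total / 1_000_000


# Hypothetical request: a 200,000-token prompt with a 5,000-token answer.
print(f"${request_cost_usd(200_000, 5_000):.2f}")   # 6.00 + 0.90 = $6.90
```

At these rates, filling the full 1.05M-token context window would cost about $31.50 in input tokens alone, which is why long-context requests dominate cost planning for agentic workloads.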

Scientific capability demonstrations

Two results in the bundle represent a qualitative claim about AI's role in frontier science. GPT-5.2 proposed a novel formula for a gluon amplitude in theoretical physics that was subsequently formally proved by OpenAI researchers and academic collaborators. A later OpenAI model disproved the Erdős planar unit distance conjecture — an 80-year-old open problem in discrete geometry — reportedly at a compute cost under $1,000. Both results were independently verified, distinguishing them from AI-assisted proof-checking or known-result reproduction.

A separate biosecurity benchmark (ABC-Bench) found that OpenAI's o4-mini-high produced scripts that successfully assembled DNA on a liquid-handling robot, outperforming median expert humans — a result that highlights the dual-use risk surface of capable agentic models.

Capital, compute, and infrastructure

OpenAI's financial trajectory is without precedent in private technology: a $110B round at a $730B valuation (February 2026, anchored by $30B from SoftBank, $30B from NVIDIA, and $50B from Amazon) was followed within weeks by a $122B raise earmarked for global frontier AI development and compute infrastructure. The Stargate Project (announced January 2025) targets up to $500B in US AI infrastructure over four years.

Cloud relationships have grown more complex. Microsoft remains the primary partner — its 2019 $1B investment and exclusive cloud arrangement laid the foundation for all of OpenAI's large-scale training — but OpenAI has diversified: a strategic AWS partnership (February 2026) brings OpenAI Frontier to Amazon Bedrock for stateful agent workloads, exploiting a legal distinction that preserves Microsoft's exclusive rights over stateless API calls. Amazon committed up to $35B in investment and $100B in Trainium compute over eight years.

Government and defence posture

OpenAI signed a formal contract with the U.S. Department of War (February 2026) covering AI deployment in classified environments, with negotiated safety red lines. The contract allows use of OpenAI models "for all lawful purposes" — a formulation that Altman later described as rushed and subsequently renegotiated. This contrasts sharply with Anthropic's refusal to remove restrictions on autonomous weapons and mass domestic surveillance, which led to Anthropic's formal designation as a supply-chain risk to national security. OpenAI's willingness to engage with defence use cases, even with friction, positions it as the default frontier AI vendor for US government workloads.

Competitive landscape

The events bundle surfaces two primary competitive axes. Against Anthropic: Claude Opus 4.6 claims to outperform GPT-5.2 by 144 Elo on GDPval-AA and leads on several other benchmarks, while OpenAI's GPT-5.4 Pro claims SOTA on GDPval-AA, SWE-Bench-Pro, and Terminal-Bench-Hard. Benchmark leadership is therefore contested and benchmark-specific. Against the open-source ecosystem: DeepSeek-R1 claimed o1 parity at dramatically lower API cost ($0.55/$2.19 per million input/output tokens vs. OpenAI's top-tier pricing), which likely accelerated OpenAI's open-weights release.
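An Elo gap is easier to interpret as an expected head-to-head score. The sketch below converts the 144-point gap cited above into that form, assuming the benchmark uses the standard logistic Elo model with a 400-point scale. That assumption is not confirmed by the source, so the result is indicative only.

```python
# Converting an Elo gap into an expected head-to-head score.
# ASSUMPTION: standard logistic Elo model with a 400-point scale.

def elo_expected_score(rating_gap: float, scale: float = 400.0) -> float:
    """Expected score (win probability, counting draws as half) of the higher-rated side."""
    return 1.0 / (1.0 + 10 ** (-rating_gap / scale))


print(f"{elo_expected_score(144):.3f}")
```

Under that assumption, a 144-point lead corresponds to the higher-rated model being preferred in roughly 70% of pairwise comparisons: a clear lead, but far from a sweep.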

Where it's heading

The pattern across the bundle is consistent: OpenAI is expanding from model provider to AI infrastructure layer — multi-cloud, multi-modal, open-weight where strategically useful, and now embedded in classified government systems. The pace of model iteration (five named GPT-5.x releases within roughly six months), the scale of capital deployment, and the domain-specialised model strategy (GPT-Rosalind for life sciences) all point toward a company that intends to be present at every layer of the AI stack, not just the frontier model tier.

[Figure: OpenAI model lineage, key architectural pivots]

OpenAI GPT-5 family at a glance

| Model | Context window | Key capability | Pricing (input / output, USD per M tokens) | Notable |
|---|---|---|---|---|
| GPT-5 | n/a | SOTA coding, math, writing, vision at launch | n/a | Unified routing: gpt-5-main + gpt-5-thinking + nano variants |
| GPT-5.2 | n/a | Reasoning, long-context, vision; new physics result | n/a | First GPT-5.x in the bundle; agentic workflow focus |
| GPT-5.4 / Pro | 1.05M tokens | Computer use, tool search, SOTA on GDPval-AA & SWE-Bench-Pro | $30 / $180 (Pro tier) | Powers Codex; Thinking and Pro variants |
| GPT-5.4 mini / nano | n/a | Efficiency-optimised for agentic pipelines | n/a | Sub-agent and high-volume API workloads |
| GPT-5.5 | n/a | Speed + reasoning for coding, research, data analysis | n/a | Most capable at announcement; system card published |
| gpt-oss-120b / 20b | n/a | Open-weight reasoning, strong tool use | n/a | Apache 2.0; consumer-hardware optimised |

All data from the events bundle; cells with no data are marked n/a.

Timeline

  1. GPT-1 published; pre-train + fine-tune paradigm established (2018)
  2. Microsoft invests $1B; becomes exclusive cloud provider (2019)
  3. Scaling laws paper published; predictive framework for LLM training (2020)
  4. GPT-3 (175B) released; few-shot learning at scale (2020)
  5. CLIP introduced; zero-shot vision via language supervision (2021)
  6. ChatGPT launched; LLMs go mainstream (November 2022)
  7. GPT-4 released; multimodal, human-level benchmarks (March 2023)
  8. Leadership crisis: Altman removed then reinstated; board restructured (November 2023)
  9. Sora introduced; video generation as world simulation (February 2024)
  10. GPT-4o (Omni); native multimodal across text, audio, vision (May 2024)
  11. o1 released; inference-time compute scaling via chain-of-thought RL (September 2024)
  12. Stargate Project announced; up to $500B in US AI infrastructure (January 2025)
  13. gpt-oss-120b / 20b released under Apache 2.0; OpenAI enters open weights (August 2025)
  14. GPT-5 launched with unified routing architecture (August 2025)
  15. $110B raised at $730B valuation; Amazon strategic partnership announced (February 2026)
  16. Department of War contract signed for classified AI deployments (February 2026)
  17. GPT-5.4 released; 1.05M-token context, computer use, tool search (March 2026)
  18. $122B additional funding announced for global frontier AI expansion (March 2026)
  19. OpenAI model disproves 80-year-old Erdős unit distance conjecture

FAQ

What is the relationship between OpenAI and Microsoft?

Microsoft became OpenAI's exclusive cloud provider with a $1B investment in 2019; the relationship has since deepened into a multi-hundred-billion compute commitment, though OpenAI has begun diversifying to AWS for stateful agent workloads while Microsoft retains exclusive rights to host stateless API calls.

What distinguishes the o-series (o1, o3, o4-mini) from the GPT series?

The o-series models use inference-time compute scaling — spending more compute at generation time via chain-of-thought reinforcement learning — rather than relying solely on training-time scale, making them particularly strong on math, science, and coding benchmarks.

Has OpenAI released any open-weight models?

Yes. gpt-oss-120b and gpt-oss-20b were released in August 2025 under the Apache 2.0 license. They were OpenAI's first open-weight language models since GPT-2 (2019), ending years of proprietary-only language model releases.

What is the Stargate Project?

Stargate is a joint infrastructure initiative announced in January 2025 targeting up to $500 billion in US AI compute and data center investment over four years, involving OpenAI and partners including SoftBank.

How does OpenAI's defence posture differ from Anthropic's?

OpenAI signed a Department of War contract allowing use of its models 'for all lawful purposes' with negotiated safety carve-outs, while Anthropic refused to remove restrictions on autonomous weapons and mass surveillance and was formally designated a supply-chain risk — a stark divergence in how the two labs handle government and military use.

What scientific breakthroughs have OpenAI models produced?

GPT-5.2 proposed a novel gluon amplitude formula in theoretical physics that was subsequently formally proved, and a later OpenAI model disproved the 80-year-old Erdős planar unit distance conjecture in discrete geometry — both representing AI-generated, independently verified new scientific results.

More on OpenAI (6)

Latent Space · 1mo ago

GPT-Realtime-2, GPT-Translate, and new Whisper: OpenAI's new SOTA realtime voice APIs

OpenAI has released a suite of new real-time voice and audio APIs including GPT-Realtime-2, a GPT-Translate model, and an updated Whisper, all positioned as state-of-the-art for real-time voice applications. The releases appear to be part of a broader push to deploy GPT-5 capabilities across multiple product surfaces. Coverage comes from the Latent Space AI News digest, which aggregates and contextualizes the announcements.

OpenAI Blog · 1mo ago

OpenAI launches DeployCo enterprise deployment company

OpenAI has announced DeployCo, a new enterprise-focused deployment company aimed at helping organizations integrate frontier AI into production environments and generate measurable business outcomes. The move represents OpenAI expanding beyond model development into a dedicated deployment and professional services arm. This signals a strategic shift toward capturing enterprise value from AI adoption, not just model licensing.

The Batch · 1mo ago

OpenAI Updates Audio Models That Reason, Transcribe, and Translate

OpenAI introduced three new audio models in its Realtime API: GPT-Realtime-2 (speech-to-speech with five configurable reasoning effort levels), GPT-Realtime-Translate (70+ input languages), and GPT-Realtime-Whisper (transcription). GPT-Realtime-2 operates as an end-to-end audio model including reasoning, with latency ranging from 1.12 seconds at minimal effort to 2.33 seconds at high effort. Benchmark results are mixed: it leads Scale AI's Audio MultiChallenge and Artificial Analysis Conversational Dynamics but trails Step-Audio R1.1 Realtime and Grok Voice Think Fast 1.0 on speech reasoning and agentic tasks. The configurable reasoning-latency tradeoff is positioned as a key differentiator for voice agent applications.

OpenAI Blog · 1mo ago

OpenAI Launches GPT-5.5 and GPT-5.5-Cyber with Expanded Trusted Access for Cyber Program

OpenAI is expanding its Trusted Access for Cyber program with two new models: GPT-5.5 and GPT-5.5-Cyber, a specialized variant aimed at cybersecurity applications. The program provides verified defenders with access to these models to accelerate vulnerability research and protect critical infrastructure. This represents a continuation of OpenAI's strategy of releasing domain-specialized model variants with controlled access tiers for sensitive use cases.

OpenAI Blog · 1mo ago

Advancing voice intelligence with new models in the API

OpenAI is releasing new realtime voice models via its API with capabilities spanning reasoning, translation, and transcription. The announcement targets developers building voice-enabled applications and represents an expansion of OpenAI's voice intelligence offerings beyond the existing Realtime API. The models are positioned to enable more natural and intelligent voice experiences in production deployments.

OpenAI Blog · 1mo ago

Testing ads in ChatGPT

OpenAI has announced it is beginning to test advertising within ChatGPT as a mechanism to support free-tier access. The company states ads will be clearly labeled, will not influence answer content, and will include privacy protections and user controls. This marks a significant monetization strategy shift for OpenAI's flagship consumer product.