Almanac
Guide · In-depth

NVIDIA: The Infrastructure Layer Powering the AI Era

NVIDIAIn-depthactive·v1 · live·generated 6d ago

Part of these paths

TL;DRNVIDIA has evolved from a GPU manufacturer into the indispensable substrate of the AI industry — supplying the compute that trains frontier models, the software stack that serves them, and increasingly the open-weights models that run on top of them. Its position is reinforced by a web of strategic partnerships with every major AI lab, while its own research arm pushes into physical AI, open-weights models, and AI-accelerated chip design.

Key takeaways

  • NVIDIA committed $30B to OpenAI's $110B investment round and separately partnered with OpenAI to deploy 10 gigawatts of AI datacenter capacity.
  • NVIDIA invested up to $10B in Anthropic and committed up to 1 GW of Grace Blackwell/Vera Rubin compute, with co-optimization of future architectures planned.
  • Mistral Large 3 was trained on 3,000 NVIDIA H200 GPUs; Mistral joined the NVIDIA Nemotron Coalition as a founding member; Mistral Compute's sovereign AI infrastructure offering is built on NVIDIA hardware.
  • NVIDIA's own Nemotron model family spans a wide range: Nemotron 3 Super 120B-A12B (hybrid Mamba-2/MoE, 1M-token context, 442 tokens/sec), Nemotron 3 Nano Omni (multimodal, long-context), and Cosmos 3 (open omni-model for physical AI).
  • NVIDIA's AI-for-chip-design program — NVCell, PrefixRL, ChipNeMo — delivers measurable gains: PrefixRL circuits are 20–30% better than human designs; NVCell redesigns ~2,500–3,000 layout cells overnight versus 10 engineer-months.
  • NVIDIA announced a $26B five-year investment in open-weights models, framed partly as a strategic response to Chinese labs building capable open-weights models on non-NVIDIA hardware.

What NVIDIA is

NVIDIA is a semiconductor and AI platform company whose GPUs — and increasingly its full-stack software and model ecosystem — form the dominant substrate on which the AI industry runs. Its hardware (H200, Grace Blackwell, Vera Rubin, DGX Spark, RTX Spark) powers training runs at every major frontier lab; its software (TensorRT-LLM, NeMo, NIM microservices, NemoClaw) shapes how models are served and deployed in production; and its own research arm now ships open-weights models across language, multimodal, and physical AI domains.

Why it matters

NVIDIA occupies a structurally unusual position: it is simultaneously a supplier to, investor in, and competitor of the labs it enables. Every major frontier lab — OpenAI, Anthropic, Mistral — depends on NVIDIA hardware, yet NVIDIA has made multi-billion-dollar equity investments in both OpenAI ($30B) and Anthropic (up to $10B), and releases its own models (Nemotron, Cosmos) that compete in the open-weights space. This creates a flywheel: hardware revenue funds model research, model research validates hardware, and the resulting ecosystem lock-in reinforces both.

The partnership web

The scale of NVIDIA's bilateral commitments is striking. With OpenAI, NVIDIA joined a $110B investment round at $30B and separately committed to deploying 10 gigawatts of AI datacenter capacity, with the first phase launching in 2026. With Anthropic, NVIDIA invested up to $10B and committed up to 1 GW of Grace Blackwell and Vera Rubin compute, with a deep technology partnership to co-optimize future NVIDIA architectures for Anthropic workloads — a meaningful concession of roadmap influence. With Mistral AI, NVIDIA co-released Mistral NeMo (a 12B model with 128k context), co-optimized Mistral Large 3 (trained on 3,000 H200 GPUs) and Mistral Small 4 for Blackwell/Hopper kernels and NVFP4 format, and anchored the Nemotron Coalition with Mistral as a founding member. Mistral's sovereign infrastructure product, Mistral Compute, is itself built on NVIDIA hardware.

With Hugging Face, NVIDIA launched a joint Training Cluster as a Service offering. With SpaceX, Anthropic's Colossus deal gave it access to 220,000+ NVIDIA GPUs — further cementing NVIDIA's presence even in deals it isn't a direct party to.

The Nemotron and Cosmos model families

NVIDIA's own model output has accelerated substantially. The Nemotron family now spans:

  • Nemotron 3 Super 120B-A12B: a hybrid Mamba-2/Transformer/MoE model activating 12B parameters per token, supporting 1M-token context, claiming 442 tokens/second (fastest in its size class), and leading open-weights models on the PinchBench agentic evaluation — outperforming models with far more total parameters.
  • Nemotron 3 Nano Omni: a multimodal model targeting long-context understanding across documents, audio, and video for agentic use cases.
  • Nemotron 3 Nano 4B: a hybrid Mamba-Transformer on-device model for edge deployment.
  • Nemotron 3.5 Content Safety: a multimodal enterprise safety model for content moderation.
  • Nemotron-Labs diffusion LMs: a research push into non-autoregressive generation for inference speed.

The Cosmos family targets physical AI — robotics and embodied agents:

  • Cosmos 3: released as the first open omni-model for physical AI reasoning and action.
  • Cosmos Reason 2: adds advanced reasoning to physical AI applications.
  • Cosmos Predict 2.5: a world model fine-tunable with LoRA/DoRA for robot video generation.

NVIDIA also released Ising, a family of open AI models targeting quantum processor calibration and error correction, achieving 2.5x faster and 3x more accurate decoding than pyMatching, with adoption by Fermilab and Harvard.

AI-accelerated chip design

NVIDIA is applying AI to its own design pipeline at scale. Chief scientist Bill Dally described five-stage AI integration at GTC 2025:

  • NVCell: an RL + genetic algorithm system that redesigns ~2,500–3,000 layout cells overnight, a task that previously required 10 engineer-months.
  • PrefixRL: RL-designed arithmetic circuits that are 20–30% better than human designs.
  • ChipNeMo / BugNeMo: LLaMA 2-based LLMs fine-tuned on internal GPU documentation for engineering assistance.

Dally acknowledged that fully autonomous GPU design from a prompt remains distant, but the measurable gains at each stage represent a compounding advantage in design velocity.

Enterprise software: NemoClaw and NIM

Beyond hardware and models, NVIDIA is building the enterprise middleware layer. NemoClaw, unveiled at GTC 2026, integrates with OpenClaw to add security and governance for agentic deployments, with launch partners including Salesforce, Cisco, and CrowdStrike. NIM (NVIDIA Inference Microservices) packages models — including third-party ones like Mistral Small 4 — for standardized deployment. TensorRT-LLM is now supported as a backend in Hugging Face's Text Generation Inference, extending its reach into the open-source serving ecosystem.

Geopolitical exposure and the open-weights bet

NVIDIA's dominance is not without risk. DeepSeek withheld pre-release access to its V4 model from NVIDIA (and AMD) while sharing it with Huawei — a signal of deepening supply-chain fragmentation as China pushes domestic chip self-sufficiency. An unnamed Trump administration official claimed DeepSeek-V4 was trained on NVIDIA's most advanced chips despite U.S. export controls, though the sourcing is unverified. NVIDIA's announced $26B five-year investment in open-weights models is partly framed as a strategic response: if capable open-weights models proliferate on non-NVIDIA hardware, NVIDIA's software and ecosystem advantages become the differentiator.

Where it's heading

The events in this bundle point to three converging bets: (1) NVIDIA as the preferred compute layer for every major AI lab, reinforced by equity stakes and architecture co-optimization agreements; (2) NVIDIA as a first-class model publisher in the open-weights ecosystem, with Nemotron and Cosmos positioned to anchor the Nemotron Coalition's output; and (3) NVIDIA as the enterprise middleware provider for agentic AI, via NIM, NemoClaw, and the DGX Spark edge platform. The binding constraint — geopolitical fragmentation and the rise of alternative silicon — is real, but NVIDIA's response is to deepen software and ecosystem lock-in faster than hardware alternatives can mature.

NVIDIA's strategic position: hardware, models, and partnerships

NVIDIA's Nemotron open-weights model family at a glance

ModelArchitectureContextKey capabilityLicense
Nemotron 3 Super 120B-A12BHybrid Mamba-2 / Transformer / MoE (12B active)1M tokens442 tok/s; leads PinchBench agentic evalPermissive commercial
Nemotron 3 Nano OmniMoE omnimodalLong-contextDocuments, audio, video agents
Nemotron 3 Nano 4BHybrid Mamba-TransformerOn-device deployment
Cosmos 3Omni-modelPhysical AI reasoning & action (open-weights)Open
Cosmos Reason 2Advanced reasoning for physical AI
Nemotron 3.5 Content SafetyMultimodalEnterprise content moderation

Sourced from event bundle; unknown cells render —.

Timeline

  1. Mistral NeMo 12B released jointly with NVIDIA — first major co-branded model

  2. GTC 2025: NVIDIA announces open models and datasets for physical AI developers

  3. Hugging Face + NVIDIA launch Training Cluster as a Service; Mistral Compute built on NVIDIA hardware

  4. OpenAI–NVIDIA 10 GW datacenter partnership announced

  5. NVIDIA invests up to $10B in Anthropic; commits 1 GW Grace Blackwell/Vera Rubin compute

  6. NVIDIA commits $30B to OpenAI's $110B investment round

  7. Mistral joins NVIDIA Nemotron Coalition as founding member

  8. Nemotron 3 Super 120B released; $26B open-weights investment announced

  9. Cosmos 3 released as first open omni-model for physical AI

Related topics

FAQ

Is NVIDIA just a chip company, or does it build AI models too?

Both. NVIDIA's hardware (H200, Grace Blackwell, Vera Rubin) is the dominant compute substrate for AI training and inference, but the company also develops and releases its own open-weights model families — Nemotron and Cosmos — targeting language, multimodal, and physical AI applications.

How exposed is NVIDIA to geopolitical chip restrictions?

Significantly: DeepSeek withheld pre-release access to its V4 model from NVIDIA (and AMD) while sharing it with Huawei, signaling deepening supply-chain fragmentation. NVIDIA's $26B open-weights investment is partly framed as a strategic response to Chinese labs building capable models on non-NVIDIA hardware.

What is the Nemotron Coalition?

A multi-lab initiative led by NVIDIA to advance open-source frontier foundation models; Mistral AI joined as a founding member, with the first deliverable being a base model trained on DGX Cloud underpinning the Nemotron 4 family.

How does NVIDIA use AI in its own chip design?

NVIDIA's NVCell system redesigns ~2,500–3,000 layout cells overnight (vs. 10 engineer-months manually), PrefixRL produces arithmetic circuits 20–30% better than human designs, and ChipNeMo/BugNeMo are LLaMA 2-based LLMs fine-tuned on internal GPU documentation for engineering assistance.

What is NemoClaw?

NemoClaw is an enterprise software stack unveiled at GTC 2026 that integrates with OpenClaw to add security and governance for agentic deployments, with launch partners including Salesforce, Cisco, and CrowdStrike.

Stay current

Call Me Almanac pairs the week's AI news with guides like this one — Midweek & Sunday.

Versions

  • v1live6d ago

Related guides (4)

More on NVIDIA (6)

7Mistral Ai News·1mo ago·source ↗

Mistral AI joins NVIDIA Nemotron Coalition as founding member, co-developing open frontier models

Mistral AI has announced a strategic partnership with NVIDIA as a founding member of the newly formed NVIDIA Nemotron Coalition, a multi-lab initiative to advance open-source frontier foundation models. The collaboration will combine Mistral's model architectures, multimodal capabilities, and fine-tuning expertise with NVIDIA's DGX Cloud compute and synthetic data pipelines. The coalition's first deliverable is a base model trained on DGX Cloud that will underpin the upcoming NVIDIA Nemotron 4 model family, to be open-sourced. Coinciding with the announcement, Mistral is also releasing Mistral Small 4.

8Openai Blog·1mo ago·source ↗

OpenAI and NVIDIA Announce Strategic Partnership to Deploy 10 Gigawatts of AI Datacenters

OpenAI and NVIDIA have announced a strategic partnership targeting deployment of 10 gigawatts of AI datacenter capacity powered by NVIDIA systems. The first phase of the buildout is scheduled to launch in 2026. This represents a major infrastructure commitment between two of the most prominent organizations in AI compute and model development.

6The Batch·19d ago·source ↗

Nvidia's AI Systems Design Chip Circuits, Verify Designs, and Test New Layouts

Nvidia chief scientist Bill Dally described the company's use of AI across five stages of chip design at GTC 2025, including NVCell (a RL+genetic algorithm system that redesigns ~2,500-3,000 layout cells overnight vs. 10 engineer-months), PrefixRL (RL-designed arithmetic circuits 20-30% better than human designs), and ChipNeMo/BugNeMo (LLaMA 2-based LLMs fine-tuned on internal GPU documentation). The systems demonstrate measurable improvements over human and industry-standard designs, though Dally acknowledged that fully autonomous GPU design from a prompt remains a distant goal. The piece also references a 2025 Verkoran paper describing an agentic system that autonomously designed a RISC-V CPU from a 219-word specification.

6Latent Space·18d ago·source ↗

NVIDIA Cosmos 3, Nemotron 3 Ultra, and RTX Spark

A Latent Space AI news digest covers three NVIDIA announcements: Cosmos 3 (a world model/simulation platform), Nemotron 3 Ultra (a large language model), and RTX Spark (likely a new hardware or inference product). The piece frames these as a significant win for Jensen Huang and NVIDIA's AI portfolio. Coverage is commentary-tier aggregation rather than primary technical reporting.

7The Batch·18d ago·source ↗

Nvidia releases Nemotron 3 Super 120B-A12B open-weights model with hybrid Mamba-2/MoE architecture

Nvidia released Nemotron 3 Super 120B-A12B, an open-weights LLM with a hybrid Mamba-2/transformer/MoE architecture that activates only 12B parameters per token and supports up to 1 million token context. The model claims the fastest inference speed in its size class at 442 tokens/second and leads open-weights models on PinchBench agentic task evaluation, outperforming larger models including Kimi K2.5 (1T parameters). Nvidia is releasing weights, training data, and recipes under a permissive commercial license, and plans a $26B five-year investment in open-weights models — framed partly as a strategic response to Chinese labs building capable open-weights models on non-Nvidia hardware.

7The Batch·33h ago·source ↗

Nvidia Nemotron 3 Ultra: hybrid Mamba-transformer open-weights model targeting agentic workloads

Nvidia released Nemotron 3 Ultra, a 550B parameter (55B active) hybrid Mamba-transformer mixture-of-experts model with a 1M token context window, publishing weights, training data, and RL environments under an open license. The model ranks as the highest-scoring U.S. open-weights model on the Artificial Analysis Intelligence Index (47.7-48.2) and is approximately three times faster than comparable open-weights rivals, though it trails leading Chinese models like Kimi K2.6 and DeepSeek V4 Pro on intelligence benchmarks. Nvidia used a novel Multi-Teacher On-Policy Distillation approach with 10+ specialized teacher models and trained using NVFP4 quantization. The release is strategically motivated by Nvidia's interest in a healthy open-weights ecosystem that drives AI semiconductor adoption.