7OpenAI Blog·1mo ago

Introducing Triton: Open-source GPU programming for neural networks

OpenAI released Triton 1.0, an open-source Python-like language for GPU programming targeting neural network workloads. It enables researchers without CUDA expertise to write highly efficient GPU kernels, reportedly matching expert-level performance in most cases. The release lowers the barrier to custom GPU kernel development for ML practitioners.

Training Infrastructure Inference Economics Agent and Tool Ecosystem Triton Python OpenAI CUDA

Related guides (4)

OpenAI

OpenAI: The Lab That Made AI a Household Word

Read asBeginner In-depth

Agent and Tool EcosystemTopic guide

Agent and Tool Ecosystem: How the Infrastructure Layer Around LLMs Is Consolidating

Read asIn-depth

Training InfrastructureTopic guide

Training Infrastructure: The Compute Arms Race Powering Modern AI

Read asBeginner In-depth

Inference EconomicsTopic guide

Inference Economics: The Cost Structure of Running AI Models in Production

Read asIn-depth

Related events (8)

4Openai Blog·1mo ago·source ↗

OpenAI Releases Block-Sparse GPU Kernels for Sparse Neural Networks

OpenAI released optimized GPU kernels targeting block-sparse neural network architectures, claiming orders-of-magnitude speedups over cuBLAS and cuSPARSE depending on sparsity level. The kernels were applied to achieve state-of-the-art results in text sentiment analysis and generative modeling of text and images. This release represents an early infrastructure contribution toward efficient sparse computation in deep learning.

Training Infrastructure Inference Economics cuBLAS cuSPARSE block-sparse GPU kernels +2 more

5Hugging Face Blog·1mo ago·source ↗

We Got Claude to Build CUDA Kernels and Teach Open Models

A Hugging Face blog post describes using Claude to generate CUDA kernels and then distilling that knowledge into open-weight models. The approach combines LLM-assisted low-level GPU programming with knowledge transfer to smaller open models. This sits at the intersection of AI-assisted systems programming and open-weights capability improvement.

Training Infrastructure Open Weights Progress Claude Hugging Face CUDA +2 more

7Openai Blog·1mo ago·source ↗

Introducing Prism: OpenAI's LaTeX-Native Research Workspace with GPT-5.2

OpenAI has launched Prism, a free LaTeX-native workspace designed for researchers that integrates GPT-5.2 directly into the writing and collaboration environment. The product targets academic and scientific workflows, combining document authoring with AI-assisted reasoning in a single interface. This also marks a public reference to GPT-5.2, indicating a model iteration beyond GPT-5.

Frontier Model Releases Enterprise Deployment Patterns GPT-5.2 LaTeX OpenAI +2 more

7The Batch·34h ago·source ↗

Nvidia Nemotron 3 Ultra: hybrid Mamba-transformer open-weights model targeting agentic workloads

Nvidia released Nemotron 3 Ultra, a 550B parameter (55B active) hybrid Mamba-transformer mixture-of-experts model with a 1M token context window, publishing weights, training data, and RL environments under an open license. The model ranks as the highest-scoring U.S. open-weights model on the Artificial Analysis Intelligence Index (47.7-48.2) and is approximately three times faster than comparable open-weights rivals, though it trails leading Chinese models like Kimi K2.6 and DeepSeek V4 Pro on intelligence benchmarks. Nvidia used a novel Multi-Teacher On-Policy Distillation approach with 10+ specialized teacher models and trained using NVFP4 quantization. The release is strategically motivated by Nvidia's interest in a healthy open-weights ecosystem that drives AI semiconductor adoption.

Frontier Model Releases Open Weights Progress Mamba IFBench Artificial Analysis Intelligence Index +17 more

6The Batch·17d ago·source ↗

Data Points: NemoClaw enterprise stack, GPT-5.4 mini/nano, Nemotron 3 Nano 4B, Midjourney V8, and Mamba-3

A multi-item roundup covers several AI developments: Nvidia unveiled NemoClaw at GTC 2026, an enterprise software stack integrating with OpenClaw to add security and governance for agentic deployments, with launch partners including Salesforce, Cisco, and CrowdStrike. OpenAI released GPT-5.4 mini and nano, smaller variants optimized for speed with benchmark results on SWE-Bench Pro and OSWorld-Verified, priced at $0.75 and $0.20 per million input tokens respectively. Nvidia also released Nemotron 3 Nano 4B, a hybrid Mamba-Transformer 4B parameter on-device model. Additional items cover Midjourney V8 alpha (5x faster, diffusion-only) and Mamba-3, a 1.5B state space model from CMU and Together.AI with improved accuracy over Mamba-2.

Frontier Model Releases Inference Economics Midjourney Mamba Carnegie Mellon University +19 more

7Mistral Ai News·1mo ago·source ↗

Mistral AI joins NVIDIA Nemotron Coalition as founding member, co-developing open frontier models

Mistral AI has announced a strategic partnership with NVIDIA as a founding member of the newly formed NVIDIA Nemotron Coalition, a multi-lab initiative to advance open-source frontier foundation models. The collaboration will combine Mistral's model architectures, multimodal capabilities, and fine-tuning expertise with NVIDIA's DGX Cloud compute and synthetic data pipelines. The coalition's first deliverable is a base model trained on DGX Cloud that will underpin the upcoming NVIDIA Nemotron 4 model family, to be open-sourced. Coinciding with the announcement, Mistral is also releasing Mistral Small 4.

Training Infrastructure Frontier Model Releases Mistral AI Mistral Small 4 Arthur Mensch +8 more

5Hugging Face Blog·1mo ago·source ↗

Custom CUDA Kernels for All from Codex and Claude

A Hugging Face blog post describes using AI coding agents (Codex and Claude) to automatically generate custom CUDA kernels, lowering the barrier to GPU kernel development. The piece demonstrates agent-assisted GPU programming as a practical workflow for ML practitioners. This represents a concrete application of AI coding tools to the specialized domain of CUDA/GPU optimization.

Training Infrastructure Inference Economics Claude Hugging Face OpenAI +4 more

5Hugging Face Blog·1mo ago·source ↗

Hugging Face Launches Kernel Hub for Custom GPU Kernels

Hugging Face has introduced the Kernel Hub, a centralized repository for sharing and discovering custom GPU kernels optimized for AI/ML workloads. The platform aims to make high-performance custom CUDA and Triton kernels more accessible to the broader ML community. This represents an infrastructure layer addition to the Hugging Face ecosystem, complementing its existing model and dataset hubs.

Training Infrastructure Inference Economics Triton Hugging Face Hugging Face Kernel Hub +2 more