Almanac
7·OpenAI Blog·1mo ago

Introducing Triton: Open-source GPU programming for neural networks

OpenAI released Triton 1.0, an open-source Python-like language for GPU programming targeting neural network workloads. It enables researchers without CUDA expertise to write highly efficient GPU kernels, reportedly matching expert-level performance in most cases. The release lowers the barrier to custom GPU kernel development for ML practitioners.

Related events (8)

4·OpenAI Blog·1mo ago

OpenAI Releases Block-Sparse GPU Kernels for Sparse Neural Networks

OpenAI released optimized GPU kernels targeting block-sparse neural network architectures, claiming orders-of-magnitude speedups over cuBLAS and cuSPARSE depending on sparsity level. The kernels were applied to achieve state-of-the-art results in text sentiment analysis and generative modeling of text and images. This release represents an early infrastructure contribution toward efficient sparse computation in deep learning.
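The link between sparsity level and speedup can be illustrated with a minimal sketch in plain Python. It is not OpenAI's kernel code: the storage layout (a dictionary of nonzero dense blocks), the function name `block_sparse_matvec`, and the block size are all assumptions made for illustration. The idea shown is only that all-zero blocks are never stored or visited, so the work scales with the number of nonzero blocks rather than with the full matrix size.

```python
# Illustrative block-sparse matrix-vector product in plain Python.
# Assumption: this layout and these names are hypothetical and are not taken
# from OpenAI's released kernels.

def block_sparse_matvec(blocks, block_size, n_block_rows, x):
    """Compute y = W @ x, where W is stored as a dict of nonzero dense blocks.

    blocks maps (block_row, block_col) to a block_size x block_size list of
    lists. Blocks absent from the dict are treated as all-zero and skipped.
    """
    y = [0.0] * (n_block_rows * block_size)
    for (block_row, block_col), block in blocks.items():
        row0 = block_row * block_size  # first output row covered by this block
        col0 = block_col * block_size  # first input column covered by this block
        for i in range(block_size):
            acc = 0.0
            for j in range(block_size):
                acc += block[i][j] * x[col0 + j]
            y[row0 + i] += acc
    return y


# A 4x4 matrix split into 2x2 blocks; only the two diagonal blocks are nonzero,
# so half of the blocks are skipped.
example_blocks = {
    (0, 0): [[1.0, 2.0], [3.0, 4.0]],
    (1, 1): [[5.0, 6.0], [7.0, 8.0]],
}
example_y = block_sparse_matvec(example_blocks, 2, 2, [1.0, 1.0, 1.0, 1.0])
print(example_y)
```

In this toy case only two of the four blocks are touched. A GPU kernel would process the nonzero blocks in parallel rather than in a Python loop, but the skipped work is the same source of savings.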

5·Hugging Face Blog·1mo ago

We Got Claude to Build CUDA Kernels and Teach Open Models

A Hugging Face blog post describes using Claude to generate CUDA kernels and then distilling that knowledge into open-weight models. The approach combines LLM-assisted low-level GPU programming with knowledge transfer to smaller open models. This sits at the intersection of AI-assisted systems programming and open-weights capability improvement.

7·OpenAI Blog·1mo ago

Introducing Prism: OpenAI's LaTeX-Native Research Workspace with GPT-5.2

OpenAI has launched Prism, a free LaTeX-native workspace designed for researchers that integrates GPT-5.2 directly into the writing and collaboration environment. The product targets academic and scientific workflows, combining document authoring with AI-assisted reasoning in a single interface. This also marks a public reference to GPT-5.2, indicating a model iteration beyond GPT-5.

7·The Batch·34h ago

Nvidia Nemotron 3 Ultra: hybrid Mamba-transformer open-weights model targeting agentic workloads

Nvidia released Nemotron 3 Ultra, a 550B-parameter (55B active) hybrid Mamba-transformer mixture-of-experts model with a 1M-token context window, publishing weights, training data, and RL environments under an open license. The model ranks as the highest-scoring U.S. open-weights model on the Artificial Analysis Intelligence Index (47.7-48.2) and is approximately three times faster than comparable open-weights rivals, though it trails leading Chinese models such as Kimi K2.6 and DeepSeek V4 Pro on intelligence benchmarks. Nvidia used a novel Multi-Teacher On-Policy Distillation approach with 10+ specialized teacher models and trained the model with NVFP4 quantization. The release is strategically motivated by Nvidia's interest in a healthy open-weights ecosystem that drives AI semiconductor adoption.

6·The Batch·17d ago

Data Points: NemoClaw enterprise stack, GPT-5.4 mini/nano, Nemotron 3 Nano 4B, Midjourney V8, and Mamba-3

A multi-item roundup covers several AI developments. Nvidia unveiled NemoClaw at GTC 2026, an enterprise software stack that integrates with OpenClaw to add security and governance for agentic deployments, with launch partners including Salesforce, Cisco, and CrowdStrike. OpenAI released GPT-5.4 mini and nano, smaller variants optimized for speed, with benchmark results on SWE-Bench Pro and OSWorld-Verified and prices of $0.75 and $0.20 per million input tokens, respectively. Nvidia also released Nemotron 3 Nano 4B, a 4B-parameter hybrid Mamba-Transformer on-device model. Additional items cover Midjourney V8 alpha (5x faster, diffusion-only) and Mamba-3, a 1.5B state space model from CMU and Together.AI with improved accuracy over Mamba-2.
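As a rough illustration of the quoted input-token prices, the arithmetic can be sketched as below. The dictionary keys are hypothetical labels rather than confirmed API model identifiers, the 2,000,000-token workload is an invented example, and output-token pricing is not given in the summary, so it is left out.

```python
# Input-token cost arithmetic using only the prices quoted in the roundup.
# Assumption: the keys below are illustrative labels, not official model IDs.
PRICE_PER_MILLION_INPUT_USD = {
    "gpt-5.4-mini": 0.75,
    "gpt-5.4-nano": 0.20,
}


def input_cost_usd(model, input_tokens):
    """Return the input-token cost in USD; output tokens are not included."""
    return input_tokens / 1_000_000 * PRICE_PER_MILLION_INPUT_USD[model]


# Hypothetical workload of 2,000,000 input tokens.
for name in PRICE_PER_MILLION_INPUT_USD:
    print(name, round(input_cost_usd(name, 2_000_000), 2))
```

At these rates, the same input volume costs 3.75 times as much on mini as on nano (0.75 / 0.20).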

7·Mistral AI News·1mo ago

Mistral AI joins NVIDIA Nemotron Coalition as founding member, co-developing open frontier models

Mistral AI has announced a strategic partnership with NVIDIA as a founding member of the newly formed NVIDIA Nemotron Coalition, a multi-lab initiative to advance open-source frontier foundation models. The collaboration will combine Mistral's model architectures, multimodal capabilities, and fine-tuning expertise with NVIDIA's DGX Cloud compute and synthetic data pipelines. The coalition's first deliverable is a base model trained on DGX Cloud that will underpin the upcoming NVIDIA Nemotron 4 model family, to be open-sourced. Coinciding with the announcement, Mistral is also releasing Mistral Small 4.

5·Hugging Face Blog·1mo ago

Custom CUDA Kernels for All from Codex and Claude

A Hugging Face blog post describes using AI coding agents (Codex and Claude) to automatically generate custom CUDA kernels, lowering the barrier to GPU kernel development. The piece demonstrates agent-assisted GPU programming as a practical workflow for ML practitioners. This represents a concrete application of AI coding tools to the specialized domain of CUDA/GPU optimization.

5·Hugging Face Blog·1mo ago

Hugging Face Launches Kernel Hub for Custom GPU Kernels

Hugging Face has introduced the Kernel Hub, a centralized repository for sharing and discovering custom GPU kernels optimized for AI/ML workloads. The platform aims to make high-performance custom CUDA and Triton kernels more accessible to the broader ML community. It adds an infrastructure layer to the Hugging Face ecosystem, complementing the existing model and dataset hubs.