company

NVIDIA

company · active · nvidia-fb6264b2 · 65 events · first seen 1mo ago

Aliases: NVIDIA

More like this (12)

NVIDIA Labs · NVIDIA Isaac · Optimum-NVIDIA · NVIDIA NIM · GPU · NVIDIA Cosmos · NVIDIA Nemotron Coalition · Nvidia A100 · NVIDIA Nemotron 4 · CUDA · NVIDIA DGX Cloud · NVIDIA NeMo

Guides (1)

NVIDIA

NVIDIA: The Hardware Backbone of the AI Era

Recent events (50)

7Mistral Ai News·1mo ago·source ↗

Mistral AI joins NVIDIA Nemotron Coalition as founding member, co-developing open frontier models

Mistral AI has announced a strategic partnership with NVIDIA as a founding member of the newly formed NVIDIA Nemotron Coalition, a multi-lab initiative to advance open-source frontier foundation models. The collaboration will combine Mistral's model architectures, multimodal capabilities, and fine-tuning expertise with NVIDIA's DGX Cloud compute and synthetic data pipelines. The coalition's first deliverable is a base model trained on DGX Cloud that will underpin the upcoming NVIDIA Nemotron 4 model family, to be open-sourced. Coinciding with the announcement, Mistral is also releasing Mistral Small 4.

Training Infrastructure Frontier Model Releases Mistral AI Mistral Small 4 Arthur Mensch +8 more

8Openai Blog·1mo ago·source ↗

OpenAI and NVIDIA Announce Strategic Partnership to Deploy 10 Gigawatts of AI Datacenters

OpenAI and NVIDIA have announced a strategic partnership targeting deployment of 10 gigawatts of AI datacenter capacity powered by NVIDIA systems. The first phase of the buildout is scheduled to launch in 2026. This represents a major infrastructure commitment between two of the most prominent organizations in AI compute and model development.

Training Infrastructure Frontier Model Releases NVIDIA OpenAI +1 more

6The Batch·19d ago·source ↗

Nvidia's AI Systems Design Chip Circuits, Verify Designs, and Test New Layouts

Nvidia chief scientist Bill Dally described the company's use of AI across five stages of chip design at GTC 2025, including NVCell (an RL+genetic algorithm system that redesigns ~2,500-3,000 layout cells overnight vs. 10 engineer-months), PrefixRL (RL-designed arithmetic circuits 20-30% better than human designs), and ChipNeMo/BugNeMo (Llama 2-based LLMs fine-tuned on internal GPU documentation). The systems demonstrate measurable improvements over human and industry-standard designs, though Dally acknowledged that fully autonomous GPU design from a prompt remains a distant goal. The piece also references a 2025 Verkoran paper describing an agentic system that autonomously designed a RISC-V CPU from a 219-word specification.

Training Infrastructure Inference Economics Jeff Dean BugNeMo Verkoran +10 more

6Latent Space·18d ago·source ↗

NVIDIA Cosmos 3, Nemotron 3 Ultra, and RTX Spark

A Latent Space AI news digest covers three NVIDIA announcements: Cosmos 3 (a world model/simulation platform), Nemotron 3 Ultra (a large language model), and RTX Spark (likely a new hardware or inference product). The piece frames these as a significant win for Jensen Huang and NVIDIA's AI portfolio. Coverage is commentary-tier aggregation rather than primary technical reporting.

Training Infrastructure Frontier Model Releases NVIDIA Cosmos NVIDIA RTX Spark NVIDIA +4 more

7The Batch·18d ago·source ↗

Nvidia releases Nemotron 3 Super 120B-A12B open-weights model with hybrid Mamba-2/MoE architecture

Nvidia released Nemotron 3 Super 120B-A12B, an open-weights LLM with a hybrid Mamba-2/transformer/MoE architecture that activates only 12B parameters per token and supports up to 1 million token context. The model claims the fastest inference speed in its size class at 442 tokens/second and leads open-weights models on PinchBench agentic task evaluation, outperforming larger models including Kimi K2.5 (1T parameters). Nvidia is releasing weights, training data, and recipes under a permissive commercial license, and plans a $26B five-year investment in open-weights models — framed partly as a strategic response to Chinese labs building capable open-weights models on non-Nvidia hardware.

Frontier Model Releases Open Weights Progress Nemotron 3 Super 120B-A12B Nemotron 3 Ultra-500B-A50B PivotRL +18 more
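To make the "120B-A12B" naming and the quoted 442 tokens/second concrete, here is a minimal arithmetic sketch. The function names are hypothetical, and it assumes the quoted figures are exact and that throughput stays constant, which a real deployment would not guarantee.

```python
def active_fraction(total_params_b: float, active_params_b: float) -> float:
    """Share of a mixture-of-experts model's parameters used per token."""
    return active_params_b / total_params_b


def generation_seconds(num_tokens: int, tokens_per_second: float) -> float:
    """Time to emit num_tokens at a constant decoding throughput."""
    return num_tokens / tokens_per_second


if __name__ == "__main__":
    # 120B total parameters, 12B active per token (figures from the summary).
    print(f"active share: {active_fraction(120, 12):.0%}")
    # A 1,000-token response at the quoted 442 tokens/second.
    print(f"1,000 tokens: {generation_seconds(1000, 442):.2f} s")
```

Under these assumptions, about a tenth of the weights take part in each token's forward pass, which is the main reason a sparse model can decode faster than a dense model of the same total size.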

7The Batch·33h ago·source ↗

Nvidia Nemotron 3 Ultra: hybrid Mamba-transformer open-weights model targeting agentic workloads

Nvidia released Nemotron 3 Ultra, a 550B parameter (55B active) hybrid Mamba-transformer mixture-of-experts model with a 1M token context window, publishing weights, training data, and RL environments under an open license. The model ranks as the highest-scoring U.S. open-weights model on the Artificial Analysis Intelligence Index (47.7-48.2) and is approximately three times faster than comparable open-weights rivals, though it trails leading Chinese models like Kimi K2.6 and DeepSeek V4 Pro on intelligence benchmarks. Nvidia used a novel Multi-Teacher On-Policy Distillation approach with 10+ specialized teacher models and trained using NVFP4 quantization. The release is strategically motivated by Nvidia's interest in a healthy open-weights ecosystem that drives AI semiconductor adoption.

Frontier Model Releases Open Weights Progress Mamba IFBench Artificial Analysis Intelligence Index +17 more

6Hugging Face Blog·1mo ago·source ↗

Introducing NVIDIA Nemotron 3 Nano Omni: Long-Context Multimodal Intelligence for Documents, Audio and Video Agents

NVIDIA has released Nemotron 3 Nano Omni, a multimodal model targeting long-context understanding across documents, audio, and video modalities. The model is positioned for agentic use cases requiring cross-modal reasoning. It is published via the Hugging Face blog as part of NVIDIA's Nemotron model family. No detailed technical specifications or benchmark results are provided in the available body text.

Long Context Evolution Open Weights Progress Nemotron 3 Nano Omni NVIDIA Hugging Face +3 more

6Hugging Face Blog·1mo ago·source ↗

NVIDIA's GTC 2025 Announcement for Physical AI Developers: New Open Models and Datasets

NVIDIA announced new open models and datasets for physical AI development at GTC 2025, covered via the Hugging Face blog. The release targets robotics and embodied AI developers with open-weights resources. This represents NVIDIA's continued push into the physical AI ecosystem alongside its hardware dominance.

Open Weights Progress Agent and Tool Ecosystem NVIDIA Hugging Face GTC 2025 +1 more

6Hugging Face Blog·1mo ago·source ↗

NVIDIA Cosmos Reason 2 Brings Advanced Reasoning To Physical AI

NVIDIA has released Cosmos Reason 2, a model designed to bring advanced reasoning capabilities to physical AI applications. The announcement appears on the Hugging Face blog, suggesting the model is likely available through the platform. This represents a continuation of NVIDIA's Cosmos model family targeting robotics and physical world understanding.

Frontier Model Releases Agent and Tool Ecosystem NVIDIA Cosmos Reason 2 NVIDIA Cosmos NVIDIA +2 more

5Hugging Face Blog·1mo ago·source ↗

NVIDIA brings agents to life with DGX Spark and Reachy Mini

NVIDIA is integrating its DGX Spark computing platform with the Reachy Mini robot to enable embodied AI agents. The collaboration, highlighted on the Hugging Face blog, demonstrates running agent workloads on edge hardware for robotics applications. This represents a convergence of NVIDIA's inference infrastructure with open robotics platforms.

Inference Economics Enterprise Deployment Patterns DGX Spark NVIDIA Hugging Face +2 more

4Hugging Face Blog·1mo ago·source ↗

How to Build a Healthcare Robot from Simulation to Deployment with NVIDIA Isaac for Healthcare

This Hugging Face blog post covers NVIDIA Isaac for Healthcare, a simulation-to-deployment platform for building healthcare robots. It describes the workflow for training and deploying robotic systems in medical environments using NVIDIA's Isaac simulation stack. The post represents a practical guide bridging AI-driven robotics simulation with real-world healthcare deployment.

Enterprise Deployment Patterns Agent and Tool Ecosystem NVIDIA NVIDIA Isaac Hugging Face

6Hugging Face Blog·1mo ago·source ↗

NVIDIA Releases 6 Million Multi-Lingual Reasoning Dataset

NVIDIA has released a dataset of 6 million multilingual reasoning examples, published via Hugging Face. The dataset is intended to support training and evaluation of reasoning capabilities across multiple languages. This release addresses a known gap in multilingual reasoning data availability for the research community.

Frontier Model Releases Evaluation and Benchmarking NVIDIA Multilingual Reasoning Dataset v1 NVIDIA Hugging Face +1 more

5Hugging Face Blog·1mo ago·source ↗

NVIDIA Llama Nemotron Nano VLM Released on Hugging Face Hub

NVIDIA has released the Llama Nemotron Nano VLM on Hugging Face Hub, a compact vision-language model built on the Llama architecture. The model is part of NVIDIA's Nemotron family targeting efficient multimodal inference. This release makes the model accessible to the broader research and developer community through Hugging Face's model hosting infrastructure.

Open Weights Progress Inference Economics Llama Nemotron Nano VLM NVIDIA Hugging Face +3 more

6Hugging Face Blog·1mo ago·source ↗

Hugging Face and NVIDIA Launch Training Cluster as a Service

Hugging Face and NVIDIA have announced a joint 'Training Cluster as a Service' offering, providing managed GPU cluster access for AI model training. The collaboration aims to lower the barrier for organizations to access large-scale training infrastructure without managing hardware directly. This represents a strategic partnership between a major AI platform and a leading GPU manufacturer to address enterprise training infrastructure needs.

Training Infrastructure Inference Economics NVIDIA Hugging Face Training Cluster as a Service +1 more

7Hugging Face Blog·19d ago·source ↗

Welcome NVIDIA Cosmos 3: The First Open Omni-model for Physical AI Reasoning and Action

NVIDIA has released Cosmos 3, described as the first open omni-model targeting physical AI reasoning and action. The model is hosted and announced via Hugging Face, positioning it as an open-weights offering for robotics and embodied AI applications. The announcement highlights multimodal capabilities oriented toward physical world understanding and agent-level action.

Frontier Model Releases Open Weights Progress NVIDIA Cosmos NVIDIA Hugging Face +2 more

9Anthropic News·19d ago·source ↗

Microsoft, NVIDIA, and Anthropic Announce Major Strategic Partnerships with $15B Investment and $30B Azure Compute Commitment

Anthropic has announced simultaneous strategic partnerships with Microsoft and NVIDIA, committing to purchase $30 billion of Azure compute capacity and up to one gigawatt of compute with NVIDIA Grace Blackwell and Vera Rubin systems. NVIDIA and Microsoft are investing up to $10 billion and $5 billion respectively in Anthropic, while Claude models (Sonnet 4.5, Opus 4.1, Haiku 4.5) will be available on Microsoft Foundry and across the Copilot product family. Anthropic and NVIDIA are also establishing a deep technology partnership to co-optimize model performance and future NVIDIA architectures for Anthropic workloads. Amazon remains Anthropic's primary cloud and training partner.

Training Infrastructure Frontier Model Releases Dario Amodei Microsoft Copilot Claude Opus 4.6 +18 more

6The Batch·18d ago·source ↗

The Batch Issue 346: Nvidia Nemotron Super 120B, OpenAI-Amazon Deal, Regulatory Commentary

The Batch's weekly digest covers Nvidia's release of Nemotron 3 Super 120B-A12B, an open-weights hybrid Mamba-2/transformer/MoE model with 1M token context trained on 25 trillion tokens, positioned as a speed leader in its size class for agentic applications. The issue also touches on OpenAI's Amazon deal and Grok video pricing cuts. Editor Andrew Ng's letter addresses the White House's proposed federal AI preemption framework and critiques what he characterizes as coordinated anti-AI messaging campaigns. Multiple significant industry developments are bundled in a single newsletter digest.

Frontier Model Releases Open Weights Progress Nemotron 3 Super 120B-A12B Nemotron 3 Ultra-500B-A50B DeepLearning.AI +9 more

6The Batch·17d ago·source ↗

Data Points: NemoClaw enterprise stack, GPT-5.4 mini/nano, Nemotron 3 Nano 4B, Midjourney V8, and Mamba-3

A multi-item roundup covers several AI developments: Nvidia unveiled NemoClaw at GTC 2026, an enterprise software stack integrating with OpenClaw to add security and governance for agentic deployments, with launch partners including Salesforce, Cisco, and CrowdStrike. OpenAI released GPT-5.4 mini and nano, smaller variants optimized for speed with benchmark results on SWE-Bench Pro and OSWorld-Verified, priced at $0.75 and $0.20 per million input tokens respectively. Nvidia also released Nemotron 3 Nano 4B, a hybrid Mamba-Transformer 4B parameter on-device model. Additional items cover Midjourney V8 alpha (5x faster, diffusion-only) and Mamba-3, a 1.5B state space model from CMU and Together.AI with improved accuracy over Mamba-2.

Frontier Model Releases Inference Economics Midjourney Mamba Carnegie Mellon University +19 more

5Hugging Face Blog·16d ago·source ↗

NVIDIA releases Nemotron 3.5 Content Safety, a customizable multimodal safety model for enterprise AI

NVIDIA has released Nemotron 3.5 Content Safety, a multimodal safety model designed for enterprise AI deployments with customization capabilities for global use cases. The model is announced via the Hugging Face blog, targeting content moderation and safety classification across modalities. This is relevant to the growing enterprise demand for controllable, deployable safety layers on top of foundation models.

AI Safety Research Enterprise Deployment Patterns Nemotron 3.5 Content Safety NVIDIA Hugging Face +1 more

4Github Trending·16d ago·source ↗

NVIDIA NemoClaw: Secure agent execution inside NVIDIA OpenShell with managed inference

NVIDIA has published NemoClaw, a TypeScript project on GitHub for running AI agents such as Hermes and OpenClaw more securely inside NVIDIA OpenShell with managed inference. The repository has accumulated over 20,000 stars, suggesting notable community interest. The project appears to be part of NVIDIA's broader NeMo ecosystem for enterprise AI agent deployment.

Inference Economics Agent and Tool Ecosystem Hermes OpenClaw NVIDIA +2 more

5Hugging Face Blog·1mo ago·source ↗

Mastering Long Contexts in LLMs with KVPress

NVIDIA and Hugging Face present KVPress, a library for compressing the KV cache in large language models to enable more efficient long-context inference. The tool implements multiple KV cache compression ("pressing") algorithms that reduce memory footprint and latency without retraining models. KVPress is positioned as a practical toolkit for deploying LLMs in long-context scenarios where KV cache size becomes a bottleneck.

Long Context Evolution Inference Economics KV Cache KVPress NVIDIA +2 more
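To show why KV cache size becomes the bottleneck that the KVPress entry describes, here is a minimal sizing sketch. It does not use the KVPress API: the function name, the `keep_fraction` parameter, and the model dimensions (32 layers, 8 KV heads, head dimension 128, 16-bit values) are all assumptions chosen for illustration.

```python
def kv_cache_bytes(
    seq_len: int,
    num_layers: int,
    num_kv_heads: int,
    head_dim: int,
    bytes_per_elem: int = 2,     # 16-bit values (assumption)
    batch_size: int = 1,
    keep_fraction: float = 1.0,  # hypothetical: share of tokens kept after compression
) -> int:
    """Approximate KV cache size: one key and one value vector per token, head, and layer."""
    kept_tokens = int(seq_len * keep_fraction)
    return 2 * batch_size * num_layers * num_kv_heads * head_dim * kept_tokens * bytes_per_elem


if __name__ == "__main__":
    # Hypothetical model dimensions at a 128k-token (131,072) context.
    full = kv_cache_bytes(131072, num_layers=32, num_kv_heads=8, head_dim=128)
    half = kv_cache_bytes(131072, num_layers=32, num_kv_heads=8, head_dim=128, keep_fraction=0.5)
    print(f"full cache: {full / 2**30:.0f} GiB")
    print(f"half kept:  {half / 2**30:.0f} GiB")
```

The cache grows linearly with context length, so dropping a fraction of cached tokens cuts memory by the same fraction. Deciding which tokens to drop is the job of the compression algorithm and is not modeled here.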

4Hugging Face Blog·1mo ago·source ↗

The Open Evaluation Standard: Benchmarking NVIDIA Nemotron 3 Nano with NeMo Evaluator

NVIDIA and Hugging Face present an evaluation methodology for the Nemotron 3 Nano model using the NeMo Evaluator framework. The post describes benchmark results and an open evaluation recipe intended to standardize how small/nano-scale models are assessed. It positions NeMo Evaluator as a reproducible, open evaluation stack for the community.

Evaluation and Benchmarking Open Weights Progress Nemotron 3 Nano Omni NeMo Evaluator NVIDIA +1 more

5Hugging Face Blog·1mo ago·source ↗

Serverless Inference with Hugging Face and NVIDIA NIM

Hugging Face and NVIDIA have partnered to offer serverless inference via NVIDIA NIM microservices on DGX Cloud infrastructure. The integration allows developers to run optimized model inference without managing GPU infrastructure, combining Hugging Face's model hub with NVIDIA's inference optimization stack. This represents an expansion of the existing Hugging Face–NVIDIA partnership into managed inference services.

Training Infrastructure Inference Economics NVIDIA NIM NVIDIA DGX Cloud +2 more

4Hugging Face Blog·1mo ago·source ↗

Nemotron-Personas-Japan: Synthetic Dataset for Sovereign AI

NVIDIA has released Nemotron-Personas-Japan, a synthetic dataset hosted on Hugging Face designed to support sovereign AI development in Japan. The dataset appears to consist of persona-based synthetic data in Japanese, likely intended for fine-tuning or alignment of Japanese-language models. This release is part of NVIDIA's broader Nemotron data and model family, extending it to non-English sovereign AI use cases.

Open Weights Progress Enterprise Deployment Patterns Nemotron-Personas-Japan NVIDIA Hugging Face +2 more

5Hugging Face Blog·1mo ago·source ↗

Measuring Open-Source Llama Nemotron Models on DeepResearch Bench

NVIDIA evaluates its open-source Llama Nemotron models on the DeepResearch Bench, a benchmark designed to assess deep research agent capabilities. The post appears to report competitive performance of the Nemotron models in agentic research tasks. This is relevant to the ongoing development of open-weights models capable of multi-step research and reasoning workflows.

Evaluation and Benchmarking Open Weights Progress Llama Nemotron NVIDIA DeepResearch Bench +3 more

5Hugging Face Blog·1mo ago·source ↗

Post-Training Isaac GR00T N1.5 for LeRobot SO-101 Arm

NVIDIA and Hugging Face demonstrate fine-tuning of the Isaac GR00T N1.5 robot foundation model on the SO-101 robotic arm using the LeRobot framework. The post covers post-training methodology to adapt the generalist robot policy to a specific hardware platform. This represents a practical integration between NVIDIA's robotics AI stack and Hugging Face's open robotics tooling.

Enterprise Deployment Patterns Agent and Tool Ecosystem LeRobot Isaac GR00T N1.5 NVIDIA +2 more

9Openai Blog·1mo ago·source ↗

OpenAI Announces $110B Investment Round at $730B Valuation

OpenAI has announced a $110 billion investment round at a $730 billion pre-money valuation. The round includes $30B from SoftBank, $30B from NVIDIA, and $50B from Amazon. This represents one of the largest private funding rounds in history and significantly increases OpenAI's capitalization for scaling AI infrastructure and development.

Training Infrastructure Frontier Model Releases Amazon NVIDIA OpenAI +3 more

5Hugging Face Blog·28d ago·source ↗

Towards Speed-of-Light Text Generation with Nemotron-Labs Diffusion Language Models

NVIDIA's Nemotron-Labs introduces diffusion-based language models targeting extremely fast text generation, published as a Hugging Face blog post. The piece covers the approach of using diffusion processes for language modeling as an alternative to autoregressive generation, with a focus on inference speed. This represents a continued push by NVIDIA's research arm into non-autoregressive generation paradigms.

Frontier Model Releases Inference Economics Diffusion Language Models NVIDIA Hugging Face +3 more

6The Batch·19d ago·source ↗

Data Points: Nvidia Ising Models for Quantum Computing, Meta Muse Spark, GitHub Rubber Duck, Anthropic Claude Managed Agents, GPT-5.4-Cyber

Nvidia released Ising, a family of open AI models targeting quantum processor calibration and error correction, achieving 2.5x faster and 3x more accurate decoding than PyMatching, with adoption by Fermilab, Harvard, and others. Meta announced Muse Spark, a small multimodal model powering a new AI assistant series for its apps and glasses. GitHub introduced Rubber Duck, a cross-model review feature pairing Claude with GPT-5.4 for two-pass coding agent validation. Anthropic launched Claude Managed Agents, a managed infrastructure platform for enterprise autonomous AI deployment, while OpenAI expanded its Trusted Access for Cyber program with GPT-5.4-Cyber, a fine-tuned defensive cybersecurity model.

Frontier Model Releases Inference Economics Rubber Duck Notion GPT-5.5-Cyber +22 more

7Mistral Ai News·19d ago·source ↗

Mistral NeMo: 12B Open-Weights Model with 128k Context, Built with NVIDIA

Mistral AI and NVIDIA have jointly released Mistral NeMo, a 12B parameter model under the Apache 2.0 license featuring a 128k token context window and a new tokenizer called Tekken based on Tiktoken. The model is designed as a drop-in replacement for Mistral 7B, supports multilingual applications across 11+ languages, and was trained with quantization awareness enabling FP8 inference without performance loss. Benchmark comparisons show competitive performance against Gemma 2 9B and Llama 3 8B. Weights are available on Hugging Face and the model is also packaged as an NVIDIA NIM inference microservice.

Long Context Evolution Frontier Model Releases Mistral AI Gemma 2 9B Apache 2.0 +9 more

6The Batch·17d ago·source ↗

Data Points: Perplexity Computer expands, Google Aletheia math agent, DeepSeek chip strategy, Nvidia retrieval pipeline, Stargate cancellation

The Batch's weekly data points roundup covers five significant AI developments: Perplexity expanded its Computer agentic platform to desktop, mobile, and enterprise with new APIs and financial data tools; Google released Aletheia, a Gemini-based math research agent achieving 95.1% on IMO-Proof Bench Advanced (up from 65.7%); DeepSeek withheld pre-release access to its V4 model from Nvidia and AMD while giving domestic Chinese chipmakers early access; Nvidia's NeMo Retriever topped the ViDoRe v3 leaderboard using a ReAct-based agentic retrieval loop; and OpenAI and Oracle cancelled plans to expand the Abilene Stargate campus from 1.2 GW to 2.0 GW due to financing and reliability issues.

Training Infrastructure Frontier Model Releases ViDoRe v3 Crusoe BRIGHT +19 more

5Github Trending·10d ago·source ↗

NVIDIA releases SkillSpector: security scanner for AI agent skills

NVIDIA has published SkillSpector, an open-source Python tool for scanning AI agent skills to detect vulnerabilities, malicious patterns, and security risks. The repository is trending on GitHub with 1,920 total stars and 280 added today. The tool addresses an emerging security concern as agentic AI systems proliferate and third-party skill/tool ecosystems grow.

AI Safety Research Agent and Tool Ecosystem NVIDIA SkillSpector

4Github Trending·7d ago·source ↗

NVIDIA PhysicsNeMo: open-source Physics-ML deep learning framework

NVIDIA has published PhysicsNeMo, an open-source Python framework for building, training, and fine-tuning deep learning models using Physics-ML methods. The repository has accumulated 2,933 stars on GitHub. Physics-informed ML is a growing area relevant to scientific computing and simulation workloads.

Agent and Tool Ecosystem PhysicsNeMo NVIDIA

5Openai Blog·1mo ago·source ↗

How NVIDIA Engineers and Researchers Build with Codex

OpenAI published a case study describing how NVIDIA teams use Codex powered by GPT-5.5 to ship production systems and accelerate research experimentation. The piece highlights enterprise adoption of Codex as a coding agent in a major hardware/AI lab context. It signals continued real-world deployment of OpenAI's agentic coding tools at scale.

Frontier Model Releases Enterprise Deployment Patterns NVIDIA OpenAI Codex +2 more

5Interconnects·1mo ago·source ↗

Latest open artifacts (#20): New orgs! New types of models! With Nemotron Super, Sarvam, Cohere Transcribe, & others

Interconnects' recurring open-weights roundup covers several new model releases and organizations entering the open-artifact space. Highlighted items include Nvidia's Nemotron Super, Indian AI lab Sarvam, and Cohere's Transcribe product. The piece tracks the expanding diversity of organizations and model types contributing to the open-weights ecosystem.

Open Weights Progress Agent and Tool Ecosystem Interconnects Cohere Transcribe NVIDIA +3 more

8Mistral Ai News·1mo ago·source ↗

Mistral Releases Mistral 3 Family: Mistral Large 3 (675B MoE) and Ministral 3 Series (3B–14B), All Apache 2.0

Mistral AI has announced Mistral 3, a family of open-weight models including Mistral Large 3 (41B active / 675B total sparse MoE) and three dense Ministral 3 edge models (3B, 8B, 14B), all released under Apache 2.0. Mistral Large 3 debuts at #2 on LMArena's OSS non-reasoning leaderboard, supports image understanding, and was trained on 3,000 NVIDIA H200 GPUs; a reasoning variant is forthcoming. The Ministral 3 series includes base, instruct, and reasoning variants with multimodal and multilingual capabilities, with the 14B reasoning model achieving 85% on AIME '25. The release involves deep co-optimization with NVIDIA (Blackwell/Hopper kernels, NVFP4 format), vLLM, and Red Hat, and is available across major cloud and inference platforms.

Training Infrastructure Frontier Model Releases Mistral AI Amazon Bedrock Red Hat +16 more

5Hugging Face Blog·1mo ago·source ↗

Fine-Tuning NVIDIA Cosmos Predict 2.5 with LoRA/DoRA for Robot Video Generation

This Hugging Face blog post details a workflow for fine-tuning NVIDIA's Cosmos Predict 2.5 world model using LoRA and DoRA parameter-efficient techniques for robot video generation tasks. The post covers practical implementation steps for adapting the foundation video model to robotics-specific domains. This represents a concrete application of world models to embodied AI, where synthetic video generation can support robot training data pipelines.

Inference Economics Agent and Tool Ecosystem DoRA LoRA NVIDIA +3 more
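To illustrate why LoRA counts as parameter-efficient, here is a minimal pure-Python sketch of the standard low-rank update, y = Wx + (alpha/r)·B(Ax), and its trainable-parameter count. It is not taken from the blog post or from Cosmos Predict 2.5. The helper names and dimensions are hypothetical, and DoRA's additional magnitude/direction decomposition is not covered.

```python
def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def lora_forward(W, A, B, x, alpha, r):
    """Frozen base layer plus a scaled low-rank update: W x + (alpha / r) * B (A x)."""
    base = matvec(W, x)               # frozen pretrained weights, shape d_out x d_in
    update = matvec(B, matvec(A, x))  # A is r x d_in, B is d_out x r
    scale = alpha / r
    return [b + scale * u for b, u in zip(base, update)]


def lora_trainable_params(d_in, d_out, r):
    """Only A and B are trained, giving r * (d_in + d_out) parameters."""
    return r * (d_in + d_out)


if __name__ == "__main__":
    W = [[1, 0], [0, 1]]  # toy 2x2 frozen weight
    A = [[1, 1]]          # rank r = 1
    B = [[1], [2]]
    print(lora_forward(W, A, B, [3, 4], alpha=2, r=1))
    # Hypothetical 4096 x 4096 layer with rank 16: adapter size vs. full matrix.
    print(lora_trainable_params(4096, 4096, 16), "vs", 4096 * 4096)
```

For the hypothetical 4096 x 4096 layer, the rank-16 adapter trains 1/128 of the parameters that full fine-tuning of that layer would update.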

4Hugging Face Blog·1mo ago·source ↗

Controlling Language Model Generation with NVIDIA's LogitsProcessorZoo

NVIDIA's LogitsProcessorZoo is a library providing a collection of logits processors for fine-grained control over language model text generation. The blog post, published on Hugging Face, covers how these processors can constrain, guide, or modify token sampling distributions at inference time. This tooling is relevant for applications requiring structured outputs, constrained decoding, or specialized generation behaviors without retraining.

Inference Economics Agent and Tool Ecosystem LogitsProcessorZoo NVIDIA Hugging Face
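To give a rough idea of what a logits processor does at inference time, here is a minimal pure-Python sketch. It does not use the LogitsProcessorZoo API. The function names and the toy four-token vocabulary are hypothetical, and real processors operate on batched tensors rather than lists.

```python
import math


def allow_only(logits, allowed_ids):
    """Mask every token outside allowed_ids by setting its logit to -inf."""
    allowed = set(allowed_ids)
    if not any(i in allowed for i in range(len(logits))):
        raise ValueError("allowed_ids must contain at least one valid token id")
    return [x if i in allowed else float("-inf") for i, x in enumerate(logits)]


def softmax(logits):
    """Convert logits to probabilities; masked (-inf) entries get probability 0."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    raw = [2.0, 1.0, 0.5, 3.0]  # toy logits over a 4-token vocabulary
    constrained = allow_only(raw, allowed_ids=[1, 2])
    print(softmax(constrained))  # probability mass only on tokens 1 and 2
```

Because the mask is applied before sampling, the model can only emit allowed tokens. Simple forms of constrained decoding follow this pattern, with no retraining required.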

5Hugging Face Blog·1mo ago·source ↗

Building a Healthcare Robot from Simulation to Deployment with NVIDIA Isaac

A Hugging Face blog post describes a project combining LeRobot and NVIDIA Isaac to develop a healthcare robot, covering the pipeline from simulation to real-world deployment. The post likely details how reinforcement learning or imitation learning techniques are applied in a medical robotics context. This represents a practical application of sim-to-real transfer methods in a high-stakes domain.

Enterprise Deployment Patterns Agent and Tool Ecosystem LeRobot NVIDIA NVIDIA Isaac +1 more

4Hugging Face Blog·1mo ago·source ↗

Nemotron-Personas-India: Synthesized Data for Sovereign AI

NVIDIA and Hugging Face have released Nemotron-Personas-India, a synthetic dataset designed to support sovereign AI development in India. The dataset consists of synthesized persona data intended to improve AI model performance for Indian languages, cultures, and contexts. This release reflects growing interest in localized, culturally-grounded training data as a foundation for regional AI sovereignty initiatives.

Enterprise Deployment Patterns Agent and Tool Ecosystem Nemotron-Personas-India Sovereign AI NVIDIA +1 more

5Hugging Face Blog·1mo ago·source ↗

Accelerate a World of LLMs on Hugging Face with NVIDIA NIM

NVIDIA NIM microservices are being integrated with Hugging Face to enable optimized inference deployment for a broad range of LLMs hosted on the Hub. The partnership allows developers to deploy Hugging Face models via NIM's containerized inference stack, leveraging NVIDIA's TensorRT-LLM and other optimizations. This expands the ecosystem of models accessible through NIM beyond NVIDIA's own catalog to the wider Hugging Face model repository.

Inference Economics Enterprise Deployment Patterns NVIDIA NIM NVIDIA TensorRT-LLM +2 more

5Hugging Face Blog·1mo ago·source ↗

Easily Train Models with H100 GPUs on NVIDIA DGX Cloud

Hugging Face announced integration with NVIDIA DGX Cloud, enabling users to train models on H100 GPU clusters directly through the Hugging Face platform. The partnership simplifies access to high-end training infrastructure without requiring users to manage cloud provisioning themselves. This represents a continued push to lower the barrier to large-scale model training for the broader ML community.

Training Infrastructure Inference Economics NVIDIA NVIDIA DGX Cloud H100 +2 more

5Hugging Face Blog·1mo ago·source ↗

Optimum-NVIDIA: One-Line LLM Inference Acceleration via TensorRT-LLM

Hugging Face's Optimum-NVIDIA integration wraps NVIDIA's TensorRT-LLM backend to enable high-performance LLM inference with minimal code changes. The library targets developers who want near-peak GPU throughput without manually configuring TensorRT-LLM pipelines. It is positioned as a bridge between the Hugging Face ecosystem and NVIDIA's optimized inference stack.

Inference Economics Enterprise Deployment Patterns NVIDIA TensorRT-LLM Optimum-NVIDIA +2 more

7The Batch·19d ago·source ↗

Data Points: Qwen3.7-Max, OpenAI Math Proof, Gated DeltaNet-2, Trump AI Order, Microsoft Fara1.5

This edition of The Batch covers five significant AI developments: Alibaba's Qwen3.7-Max reasoning model with 1M token context and agentic capabilities ranking fifth on the Artificial Analysis Intelligence Index; an OpenAI reasoning model resolving the 80-year-old Erdős planar unit distance problem; Nvidia's Gated DeltaNet-2 outperforming Mamba-3 and other linear attention architectures; Trump pulling back a proposed AI regulation executive order; and Microsoft Research's Fara1.5 computer-use agent family beating OpenAI Operator and Google Gemini on the Online-Mind2Web benchmark.

Long Context Evolution Frontier Model Releases Paul Erdős Fara1.5 Mamba +25 more

7The Batch·19d ago·source ↗

Claude Opus 4.8 Launches with Improved Honesty; Anthropic Previews Mythos-Class Models and Dynamic Workflows

Anthropic released Claude Opus 4.8 with improvements in coding, reasoning, agentic tasks, and notably better uncertainty flagging—approximately four times less likely than Opus 4.7 to let code flaws pass uncommented. Alongside the model, Anthropic introduced dynamic workflows in Claude Code enabling tens to hundreds of parallel subagents for large-scale engineering tasks, an effort-control slider, and a 3x price cut on fast mode. Anthropic also previewed Mythos-class models, positioned above Opus in capability, currently available to a limited set of organizations for cybersecurity work pending broader safety clearance. The same digest covers MiniMax M3 (open-weights, ~60% SWE-Bench Pro), Nvidia's RTX Spark superchip, Cosmos 3 world model, and a GR00T/Unitree robotics partnership.

Frontier Model Releases Evaluation and Benchmarking Unitree Harvey Claude Mythos +16 more

7The Batch·19d ago·source ↗

Data Points: China Blocks Meta-Manus Deal; Microsoft-OpenAI Restructure; Nvidia Nemotron Omni; Grok 4.3; OpenAI AGI Principles; IBM Granite 4.1

A roundup of major AI developments: Chinese regulators blocked Meta's acquisition of Singapore-based agent startup Manus on security grounds; Microsoft and OpenAI restructured their partnership, with OpenAI gaining freedom to sell on rival clouds while Microsoft loses its AGI-access clause; Nvidia released Nemotron 3 Nano Omni, a 30B MoE omnimodal open-weights model for local agent deployment; xAI shipped Grok 4.3 with a 1M-token context window at reduced pricing; OpenAI published AGI operating principles; and IBM released Granite 4.1 across language, vision, speech, embedding, and safety modalities.

Long Context Evolution Frontier Model Releases Palantir IBM Microsoft +17 more

7Mistral Ai News·19d ago·source ↗

Mistral AI Launches Mistral Compute: Sovereign AI Infrastructure Offering

Mistral AI has announced Mistral Compute, a new AI infrastructure product offering customers a private, integrated stack spanning bare-metal GPUs, orchestration, APIs, and managed PaaS services. Positioned as a European alternative to US and Chinese cloud providers, it targets nation-states, enterprises, and research institutions seeking data sovereignty and independent AI infrastructure. The offering is built on NVIDIA hardware with tens of thousands of GPUs available, and includes Mistral's training suite for domain-specific model development. Launch partners include BNP Paribas, Orange, Thales, and several other European enterprises.

Training Infrastructure Inference Economics Mistral AI Black Forest Labs Thales +8 more

6The Batch·18d ago·source ↗

DeepSeek withholds DeepSeek-V4 pre-release access from Nvidia and AMD, shares with Huawei

DeepSeek has given Huawei several weeks of pre-release access to its upcoming DeepSeek-V4 model for hardware optimization, while denying the same access to Nvidia and AMD — a departure from prior practice. Reuters also reported that an unnamed Trump administration official claims DeepSeek-V4 was trained on Nvidia's most advanced chips despite U.S. export controls, though the sourcing is unverified. The move signals deepening geopolitical fragmentation in AI supply chains and aligns with China's push for domestic chip self-sufficiency. DeepSeek-V4 has not yet been publicly released.

Frontier Model Releases Open Weights Progress Reuters DeepSeek V4 NVIDIA +3 more

3Github Trending·17d ago·source ↗

NVIDIA NeMo Gym: framework for evaluating and improving models and agents using environments

NVIDIA's NeMo team has published a Python library called NeMo Gym on GitHub, designed to evaluate and improve models and agents through environment-based interaction. The repository has 941 stars with minimal recent traction (+1 today). It appears to be an RL-style evaluation and training harness within the NeMo ecosystem.

Evaluation and Benchmarking Agent and Tool Ecosystem NVIDIA NeMo Gym

4Hugging Face Blog·1mo ago·source ↗

Build a Domain-Specific Embedding Model in Under a Day

A Hugging Face blog post (co-authored with NVIDIA) describes a workflow for fine-tuning domain-specific embedding models rapidly, targeting practitioners who need specialized retrieval or semantic search capabilities. The post likely covers data preparation, fine-tuning techniques, and evaluation for embedding models tailored to specific domains. It serves as a practical guide for enterprise or research deployment of custom embeddings.

Enterprise Deployment Patterns Agent and Tool Ecosystem NVIDIA Hugging Face