NVIDIA Llama Nemotron Nano VLM Released on Hugging Face Hub
NVIDIA has released Llama Nemotron Nano VLM, a compact vision-language model built on the Llama architecture, on the Hugging Face Hub. The model is part of NVIDIA's Nemotron family, which targets efficient multimodal inference. The release makes the model accessible to the broader research and developer community through Hugging Face's model hosting infrastructure.
Related events (8)
Measuring Open-Source Llama Nemotron Models on DeepResearch Bench
NVIDIA evaluates its open-source Llama Nemotron models on DeepResearch Bench, a benchmark designed to assess the capabilities of deep research agents. The post appears to report competitive performance for the Nemotron models on agentic research tasks. This is relevant to the ongoing development of open-weights models capable of multi-step research and reasoning workflows.
Accelerate a World of LLMs on Hugging Face with NVIDIA NIM
NVIDIA NIM microservices are being integrated with Hugging Face to enable optimized inference deployment for a broad range of LLMs hosted on the Hub. The partnership allows developers to deploy Hugging Face models via NIM's containerized inference stack, leveraging NVIDIA's TensorRT-LLM and other optimizations. This expands the ecosystem of models accessible through NIM beyond NVIDIA's own catalog to the wider Hugging Face model repository.
Llama 3.2 Multimodal and Edge Models Launch on Hugging Face
Meta released Llama 3.2, introducing vision-capable multimodal models alongside lightweight models optimized for on-device inference. Hugging Face published a blog post covering integration support, model availability, and deployment options across the ecosystem. The release marks Meta's first open-weights multimodal Llama models, adding image understanding to the Llama family. Smaller 1B and 3B parameter variants target edge and mobile deployment scenarios.
Introducing NVIDIA Nemotron 3 Nano Omni: Long-Context Multimodal Intelligence for Documents, Audio and Video Agents
NVIDIA has released Nemotron 3 Nano Omni, a multimodal model targeting long-context understanding across documents, audio, and video modalities. The model is positioned for agentic use cases requiring cross-modal reasoning. It is published via the Hugging Face blog as part of NVIDIA's Nemotron model family. No detailed technical specifications or benchmark results are provided in the available body text.
Llama 3.2 in Keras
Hugging Face published a blog post detailing the integration of Meta's Llama 3.2 models into the Keras framework. The post covers how developers can use Keras to load, fine-tune, and run inference with Llama 3.2, expanding the ecosystem of tools available for working with the model. This represents a tooling/framework integration update rather than a new capability announcement.
Llama Guard 4 Released on Hugging Face Hub
Meta's Llama Guard 4 safety classifier has been made available on the Hugging Face Hub. Llama Guard 4 is a content moderation model designed to detect unsafe inputs and outputs in LLM pipelines. The Hugging Face blog post announces its availability and integration into the Hub ecosystem, continuing the Llama Guard series of safety-focused models.
Llama 2 is here - get it on Hugging Face
Meta released Llama 2, a new family of open-weights large language models, and made it available through Hugging Face. The release includes both base and fine-tuned chat variants across multiple parameter sizes. It significantly expands the set of accessible open-weights frontier models, with Meta and Microsoft partnering on distribution.
Meta releases Llama 3.2 11B Vision multimodal model on Hugging Face
Meta released Llama 3.2 11B Vision, an open-weights image-text-to-text model, on Hugging Face. The model is part of the Llama 3.2 family and supports multiple languages including English, German, and French. This represents Meta's entry into open-weights multimodal models at the 11B parameter scale.



