5 · Hugging Face Blog · 1mo ago

Fine-Tuning NVIDIA Cosmos Predict 2.5 with LoRA/DoRA for Robot Video Generation

This Hugging Face blog post details a workflow for fine-tuning NVIDIA's Cosmos Predict 2.5 world model using LoRA and DoRA parameter-efficient techniques for robot video generation tasks. The post covers practical implementation steps for adapting the foundation video model to robotics-specific domains. This represents a concrete application of world models to embodied AI, where synthetic video generation can support robot training data pipelines.
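To make the LoRA/DoRA distinction concrete, here is a minimal NumPy sketch of how a DoRA-style weight is recomposed. It is not taken from the post: the function name `dora_weight`, the toy shapes, and the initial values are assumptions for illustration, and the column-wise norm follows the usual description of DoRA (a learned magnitude vector multiplied by a normalized direction that receives a LoRA update). Real implementations may normalize along a different axis depending on how the weight is laid out.

```python
import numpy as np

def dora_weight(W0, A, B, m, eps=1e-12):
    """Recompose a DoRA-style weight (hypothetical helper).

    W0: frozen pretrained weight, shape (d_out, d_in)
    A, B: trainable low-rank factors, shapes (r, d_in) and (d_out, r)
    m: trainable magnitude vector, shape (1, d_in)
    """
    V = W0 + B @ A                                        # direction gets the LoRA update
    col_norm = np.linalg.norm(V, axis=0, keepdims=True)   # one norm per column
    return m * (V / (col_norm + eps))                     # rescale each unit column by m

rng = np.random.default_rng(0)
d_out, d_in, r = 6, 4, 2                                  # toy sizes, assumed for the example

W0 = rng.standard_normal((d_out, d_in))
A = rng.standard_normal((r, d_in)) * 0.01                 # small random init
B = np.zeros((d_out, r))                                  # zero init: no change at step 0
m = np.linalg.norm(W0, axis=0, keepdims=True)             # magnitude starts at W0's column norms

W = dora_weight(W0, A, B, m)
```

With `B` initialized to zero and `m` set to the column norms of `W0`, the recomposed weight matches the pretrained weight, so training starts from the base model's behavior. Plain LoRA would instead use `W0 + B @ A` directly, with no separate magnitude term.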

Inference Economics · Agent and Tool Ecosystem · Multimodal Progress · DoRA · LoRA · NVIDIA · Hugging Face · NVIDIA Cosmos Predict 2.5

Related guides (3)

Hugging Face

Hugging Face: The Home of Open-Source AI

LoRA · Concept

LoRA: How to Teach a Giant AI New Tricks Without Rebuilding It

NVIDIA

NVIDIA: The Hardware Backbone of the AI Era

Related events (8)

5 · Hugging Face Blog · 1mo ago

(LoRA) Fine-Tuning FLUX.1-dev on Consumer Hardware

This Hugging Face blog post covers techniques for fine-tuning the FLUX.1-dev image generation model using LoRA (Low-Rank Adaptation) on consumer-grade hardware. The post likely addresses quantization strategies (QLoRA) to reduce memory requirements, enabling training on GPUs with limited VRAM. This is relevant to the open-weights and accessible fine-tuning ecosystem for diffusion models.

Open Weights Progress · Inference Economics · Black Forest Labs · FLUX.1-dev · LoRA · +3 more

4 · Hugging Face Blog · 1mo ago

LoRA Training Scripts of the World, Unite!

Hugging Face published a blog post consolidating and comparing advanced LoRA fine-tuning scripts for Stable Diffusion XL, covering techniques such as pivotal tuning, custom captions, and various regularization strategies. The post aims to unify fragmented community training approaches into a more coherent set of best practices. It serves as a practical guide for practitioners fine-tuning SDXL models with LoRA adapters.

Open Weights Progress · Agent and Tool Ecosystem · LoRA · Stable Diffusion 3 · Pivotal Tuning · +2 more

5 · Hugging Face Blog · 1mo ago

Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models

This Hugging Face blog post provides a technical guide for fine-tuning Microsoft's Florence-2 vision-language models. Florence-2 is a compact yet capable multimodal model supporting tasks like captioning, object detection, and OCR. The post covers practical implementation details for adapting the model to custom datasets using the Hugging Face ecosystem.

Enterprise Deployment Patterns · Agent and Tool Ecosystem · Microsoft · Hugging Face · Florence-2 · +1 more

5 · Hugging Face Blog · 1mo ago

Bringing Robotics AI to Embedded Platforms: Dataset Recording, VLA Fine-Tuning, and On-Device Optimizations

NXP and Hugging Face describe a pipeline for deploying Vision-Language-Action (VLA) models on embedded/edge hardware, covering dataset recording, fine-tuning, and on-device optimization techniques. The post targets robotics applications where inference must run on resource-constrained microcontrollers or SoCs rather than cloud GPUs. Key topics include quantization, model compression, and integration with the LeRobot ecosystem. This represents a practical engineering bridge between frontier VLA research and real-world embedded robotics deployment.

Inference Economics · Agent and Tool Ecosystem · LeRobot · NXP Semiconductors · Vision-Language-Action model · +3 more

5 · Hugging Face Blog · 1mo ago

Using LoRA for Efficient Stable Diffusion Fine-Tuning

This Hugging Face blog post explains how Low-Rank Adaptation (LoRA) can be applied to fine-tune Stable Diffusion models efficiently. LoRA reduces the number of trainable parameters by decomposing weight updates into low-rank matrices, enabling fine-tuning on consumer hardware with significantly less memory. The post covers practical implementation details using the diffusers library.
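The parameter saving described above can be illustrated with a small NumPy sketch. This is not the post's code or the diffusers API: the helper `lora_delta`, the layer size, the rank, and the scaling factor are all assumed values chosen to show the arithmetic of a low-rank update.

```python
import numpy as np

def lora_delta(A, B, alpha):
    """Return the low-rank weight update (alpha / r) * B @ A (hypothetical helper)."""
    r = A.shape[0]
    return (alpha / r) * (B @ A)

rng = np.random.default_rng(0)
d_out, d_in, r, alpha = 768, 768, 8, 16        # illustrative sizes, not from the post

W0 = rng.standard_normal((d_out, d_in))        # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01      # trainable, small random init
B = np.zeros((d_out, r))                       # trainable, zero init so the update starts at 0

W_eff = W0 + lora_delta(A, B, alpha)           # weight actually used in the forward pass

full_params = W0.size                          # parameters updated by full fine-tuning
lora_params = A.size + B.size                  # parameters updated by LoRA
print(full_params, lora_params, full_params // lora_params)
```

For this assumed 768 x 768 layer at rank 8, only the two thin factors are trained, which is 48 times fewer parameters than updating the full matrix. The frozen weight still has to be held in memory for the forward pass; the saving comes from the much smaller set of gradients and optimizer states.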

Open Weights Progress · Agent and Tool Ecosystem · LoRA · Stable Diffusion 3 · Hugging Face · +2 more

7 · Hugging Face Blog · 19d ago

Welcome NVIDIA Cosmos 3: The First Open Omni-model for Physical AI Reasoning and Action

NVIDIA has released Cosmos 3, described as the first open omni-model targeting physical AI reasoning and action. The model is hosted and announced via Hugging Face, positioning it as an open-weights offering for robotics and embodied AI applications. The announcement highlights multimodal capabilities oriented toward physical world understanding and agent-level action.

Frontier Model Releases · Open Weights Progress · NVIDIA Cosmos · NVIDIA · Hugging Face · +2 more

5 · Hugging Face Blog · 1mo ago

We Got Claude to Fine-Tune an Open Source LLM

Hugging Face demonstrates using Claude (Anthropic's model) as an orchestrating agent to autonomously fine-tune an open-source LLM, showcasing an agentic workflow for model training. The post illustrates how a frontier model can handle the end-to-end process of dataset preparation, training configuration, and execution for a smaller open-weights model. This represents a practical example of AI-assisted ML engineering and agent-tool ecosystem development.

Open Weights Progress · Agent and Tool Ecosystem · Claude · Hugging Face · Anthropic

5 · Hugging Face Blog · 1mo ago

Post-Training Isaac GR00T N1.5 for LeRobot SO-101 Arm

NVIDIA and Hugging Face demonstrate fine-tuning of the Isaac GR00T N1.5 robot foundation model on the SO-101 robotic arm using the LeRobot framework. The post covers post-training methodology to adapt the generalist robot policy to a specific hardware platform. This represents a practical integration between NVIDIA's robotics AI stack and Hugging Face's open robotics tooling.

Enterprise Deployment Patterns · Agent and Tool Ecosystem · LeRobot · Isaac GR00T N1.5 · NVIDIA · +2 more