torchtune: PyTorch Native Post-Training Library for LLMs
Meta's PyTorch team introduces torchtune, a PyTorch-native library for post-training LLMs that emphasizes modularity, hackability, and direct access to underlying PyTorch components. The library supports fine-tuning, experimentation, and deployment-oriented workflows across distributed training settings. Benchmarked against the popular frameworks Axolotl and Unsloth, torchtune demonstrates competitive performance and memory efficiency while maintaining flexibility for research iteration. The paper presents the library's design principles, model builders, training recipes, and details of its distributed training stack.
Related guides (4)

Open Weights Progress (Topic guide)
Open Weights Progress: How Freely Available AI Models Caught Up to the Frontier

Agent and Tool Ecosystem (Topic guide)
Agent and Tool Ecosystem: How the Infrastructure Layer Around LLMs Is Consolidating
Related events (8)
Make LLM Fine-tuning 2x faster with Unsloth and 🤗 TRL
Hugging Face published a blog post detailing an integration between Unsloth and the TRL (Transformer Reinforcement Learning) library that claims to achieve 2x faster LLM fine-tuning. The post covers how Unsloth optimizes training kernels to reduce memory usage and increase throughput. This is relevant to practitioners looking to reduce compute costs and time for fine-tuning large language models.
THUDM releases slime: RL scaling post-training framework for LLMs
THUDM (Tsinghua University's Knowledge Engineering Group) has released slime, an open-source Python framework for LLM post-training via reinforcement learning scaling. The repository has accumulated 6,548 stars with 195 added in a single day, indicating significant community interest. RL-based post-training frameworks are a key area of active development following the success of techniques like GRPO and PPO in improving reasoning capabilities.
Optimum-NVIDIA: One-Line LLM Inference Acceleration via TensorRT-LLM
Hugging Face's Optimum-NVIDIA integration wraps NVIDIA's TensorRT-LLM backend to enable high-performance LLM inference with minimal code changes. The library targets developers who want near-peak GPU throughput without manually configuring TensorRT-LLM pipelines. It positions itself as a bridge between the Hugging Face ecosystem and NVIDIA's optimized inference stack.
Optimizing your LLM in production
A Hugging Face blog post covering practical techniques for optimizing large language models in production environments. The post likely addresses inference efficiency methods such as quantization, batching, caching, and hardware utilization strategies. It serves as a practitioner-oriented guide for deploying LLMs at scale.
Preference Tuning LLMs with Direct Preference Optimization Methods
A Hugging Face blog post surveys Direct Preference Optimization (DPO) and related preference tuning methods for aligning large language models. The post covers the landscape of DPO variants and their practical application via the TRL library. It serves as a technical reference for practitioners implementing RLHF alternatives.
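For readers unfamiliar with the objective these methods build on, the standard DPO loss for a single preference pair can be sketched as follows. This is an illustrative sketch, not code from the blog post or from TRL: the function name `dpo_loss`, its argument names, and the default `beta` are assumptions chosen for the example, and the inputs are assumed to be summed sequence log-probabilities.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    All inputs are sequence log-probabilities (hypothetical names).
    """
    # How much more the policy prefers each response than the reference does.
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    z = beta * (chosen_ratio - rejected_ratio)
    # -log(sigmoid(z)) == log(1 + exp(-z)), computed in a numerically stable way.
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))


# When policy and reference agree, the loss is log(2); it falls as the
# policy favors the chosen response more strongly than the reference does.
neutral = dpo_loss(-10.0, -12.0, -10.0, -12.0)
improved = dpo_loss(-8.0, -14.0, -10.0, -12.0)
```

The variants surveyed in the post generally change the shape of this loss or how the reference model enters it, while keeping the pairwise structure.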
HullFT: Efficient Test-Time Finetuning of LLMs via Convex Reconstruction and Gradient Caching
HullFT is a new method for test-time finetuning (TTFT) of language models that addresses the dual bottlenecks of retrieval quality and per-query finetuning cost. It represents query embeddings as sparse convex combinations of training sequences using Frank-Wolfe optimization, yielding diverse and relevant support sets without expensive diversity-aware search. A geometric integerization step converts fractional weights into integer multiplicities, enabling a Gradient Reuse scheme that amortizes forward-backward computation across repeated examples. Experiments show improved quality-efficiency tradeoffs over prior TTFT methods, measured in bits-per-byte at lower total runtime.
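The idea of representing a query as a sparse convex combination of training points can be illustrated with a small Frank-Wolfe loop. This is a minimal sketch under stated assumptions, not HullFT's implementation: it assumes a squared Euclidean reconstruction objective with exact line search, and the function name `frank_wolfe_hull` and the toy data are hypothetical. It does not cover the integerization or Gradient Reuse steps.

```python
def frank_wolfe_hull(query, points, steps=50):
    """Approximate `query` as a convex combination of `points`.

    Minimizes 0.5 * ||sum_i w_i * points[i] - query||^2 over the simplex.
    Returns (weights, reconstruction).
    """
    n = len(points)
    weights = [0.0] * n
    weights[0] = 1.0                      # start at a vertex of the simplex
    recon = list(points[0])
    for _ in range(steps):
        residual = [r - q for r, q in zip(recon, query)]
        # Gradient w.r.t. each weight is the dot product of the point and the residual.
        grads = [sum(p_j * r_j for p_j, r_j in zip(p, residual)) for p in points]
        s = min(range(n), key=grads.__getitem__)   # linear minimization oracle
        direction = [p_j - r_j for p_j, r_j in zip(points[s], recon)]
        denom = sum(d * d for d in direction)
        if denom == 0.0:
            break
        # Exact line search for the quadratic objective, clipped to [0, 1].
        gamma = -sum(r * d for r, d in zip(residual, direction)) / denom
        gamma = max(0.0, min(1.0, gamma))
        if gamma == 0.0:
            break
        weights = [(1.0 - gamma) * w for w in weights]
        weights[s] += gamma
        recon = [r + gamma * d for r, d in zip(recon, direction)]
    return weights, recon


points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
query = [0.5, 0.5]
weights, recon = frank_wolfe_hull(query, points)
error = sum((r - q) ** 2 for r, q in zip(recon, query)) ** 0.5
```

Each iteration moves weight onto at most one new point, so stopping after a few steps yields a sparse support set; this is the property that makes Frank-Wolfe a natural fit for selecting a small set of training sequences.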
Prism: Plug-in Infrastructure for Multimodal Continual Instruction Tuning Research
Prism is an open-source codebase designed to address engineering bottlenecks in Multimodal Continual Instruction Tuning (MCIT) research. It introduces a plugin registration mechanism that separates algorithmic development from backbone MLLM implementation, allowing new continual learning strategies to be integrated without modifying the underlying model codebase. This design aims to eliminate structural fragmentation across method-specific implementations and enable fair, reproducible comparisons at scale.
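A plugin registration mechanism of the kind described can be sketched with a decorator-based registry. This is a generic illustration, not Prism's actual API: `STRATEGY_REGISTRY`, `register_strategy`, `build_strategy`, and `ReplayStrategy` are hypothetical names invented for the example.

```python
from typing import Callable, Dict

# Hypothetical global registry mapping strategy names to their classes.
STRATEGY_REGISTRY: Dict[str, Callable] = {}


def register_strategy(name: str):
    """Decorator that registers a continual-learning strategy under `name`."""
    def decorator(cls):
        if name in STRATEGY_REGISTRY:
            raise ValueError(f"strategy '{name}' is already registered")
        STRATEGY_REGISTRY[name] = cls
        return cls
    return decorator


@register_strategy("replay")
class ReplayStrategy:
    """Toy strategy; a real plugin would hook into the training loop."""

    def __init__(self, buffer_size: int = 100):
        self.buffer_size = buffer_size

    def describe(self) -> str:
        return f"replay(buffer_size={self.buffer_size})"


def build_strategy(name: str, **kwargs):
    """Look up a strategy by name so backbone code never imports it directly."""
    if name not in STRATEGY_REGISTRY:
        raise KeyError(f"unknown strategy '{name}'")
    return STRATEGY_REGISTRY[name](**kwargs)


strategy = build_strategy("replay", buffer_size=8)
```

Because the backbone only calls `build_strategy` with a name from configuration, a new method can be added in its own module without editing model code, which is the separation the summary attributes to Prism.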
TRL v1.0: Post-Training Library Built to Move with the Field
Hugging Face has released TRL v1.0, a major milestone for its post-training library focused on reinforcement learning from human feedback and related alignment techniques. The release signals a stabilization of the API and feature set after iterative development tracking the rapidly evolving post-training landscape. TRL is widely used in the open-source community for fine-tuning and aligning language models using methods such as PPO, DPO, and GRPO.

