Unsloth: Web UI and Library for Efficient Fine-tuning of Open Models
Unsloth is an open-source Python library and web UI (Unsloth Studio) for efficient fine-tuning and local inference of open-weight models, including Gemma 4, Qwen3, DeepSeek, and GPT-OSS variants. The project has accumulated over 64,000 GitHub stars, gaining 139 on the day this entry was recorded, which suggests strong community adoption. It targets practitioners who want to train and run large models locally with reduced memory and compute requirements.
Related events (8)
Make LLM Fine-tuning 2x faster with Unsloth and 🤗 TRL
Hugging Face published a blog post detailing an integration between Unsloth and the TRL (Transformer Reinforcement Learning) library, which the post claims makes LLM fine-tuning 2x faster. The post covers how Unsloth optimizes training kernels to reduce memory usage and increase throughput. This is relevant to practitioners looking to reduce the compute cost and time of fine-tuning large language models.
Train AI Models with Unsloth and Hugging Face Jobs for Free
Hugging Face has published a blog post describing how to use Unsloth in combination with Hugging Face Jobs to fine-tune AI models at no cost. The post targets practitioners looking for accessible, low-cost training workflows. It highlights the integration between Unsloth's memory-efficient training optimizations and Hugging Face's job execution infrastructure.
Open Interpreter: lightweight coding agent for open models (Deepseek, Kimi, Qwen)
Open Interpreter is an open-source Python coding agent framework supporting open-weight models including DeepSeek, Kimi, and Qwen. The project has accumulated nearly 64,000 GitHub stars, with 45 new stars on the day it trended. It provides a lightweight harness for running code-executing agents on locally hosted or open models.
ms-swift: ModelScope framework for fine-tuning 600+ LLMs and 300+ MLLMs
ms-swift is an open-source Python framework from ModelScope that supports both PEFT and full-parameter training across the CPT, SFT, DPO, and GRPO stages for 600+ LLMs and 300+ multimodal LLMs, including Qwen3, DeepSeek, Llama 4, and others. The project has accumulated 14,487 GitHub stars, and its accompanying paper was accepted at AAAI 2025. It serves as a broad-coverage training harness for the current generation of open-weight frontier models.
Mistral Small 3: 24B Latency-Optimized Open-Weight Model Released Under Apache 2.0
Mistral AI has released Mistral Small 3, a 24B-parameter model optimized for low latency that scores over 81% on MMLU and generates 150 tokens/s on a single GPU. The model is competitive with Llama 3.3 70B and Qwen 32B while being more than 3x faster on equivalent hardware, and both the pretrained and instruction-tuned checkpoints are released under Apache 2.0. It is explicitly not trained with RL or synthetic data, which positions it as a base model for community fine-tuning and reasoning capability development. Deployment targets include local inference on consumer hardware (RTX 4090, MacBook with 32GB RAM), agentic function calling, and domain-specific fine-tuning.
Qwen3.5 Small tops mobile-sized open models; GPT-5.3 Instant, Gemini 3.1 Flash-Lite, Claude memory import, and LLM deanonymization research
Alibaba released the Qwen3.5 Small model series (0.8B–9B parameters) with a hybrid Gated Delta Networks + sparse MoE architecture, with the 9B model outperforming OpenAI's gpt-oss-120B on GPQA Diamond despite being 13.5x smaller; all weights are Apache 2.0 licensed. Google introduced Gemini 3.1 Flash-Lite, a cost-optimized model at $0.25/M input tokens with 2.5x faster TTFT than Gemini 2.5 Flash. OpenAI released GPT-5.3 Instant targeting conversational quality improvements and hallucination reduction, while Anthropic added memory import/export functionality across all Claude tiers. Separately, researchers from MATS, Anthropic, and ETH Zurich demonstrated that LLM-based pipelines can deanonymize pseudonymous online users at 68% recall/90% precision for $1–4 per profile.
THUDM releases slime: RL scaling post-training framework for LLMs
THUDM (Tsinghua University's Knowledge Engineering Group) has released slime, an open-source Python framework for scaling reinforcement-learning-based post-training of LLMs. The repository has accumulated 6,548 stars, with 195 added in a single day, indicating significant community interest. RL-based post-training frameworks are an area of active development following the success of techniques like GRPO and PPO in improving reasoning capabilities.
Fine-tuning Florence-2 - Microsoft's Cutting-edge Vision Language Models
This Hugging Face blog post provides a technical guide for fine-tuning Microsoft's Florence-2 vision-language models. Florence-2 is a compact yet capable multimodal model supporting tasks like captioning, object detection, and OCR. The post covers practical implementation details for adapting the model to custom datasets using the Hugging Face ecosystem.



