5 · GitHub Trending (AI/LLM filtered) · 8d ago

ms-swift: ModelScope framework for fine-tuning 600+ LLMs and 300+ MLLMs

ms-swift is an open-source Python framework from ModelScope for training 600+ LLMs and 300+ multimodal LLMs, including Qwen3, DeepSeek, and Llama4. It supports both PEFT and full-parameter training across CPT, SFT, DPO, and GRPO. The project has accumulated 14,487 GitHub stars, and its accompanying paper was accepted at AAAI 2025. It serves as a broad-coverage training harness for the current generation of open-weights frontier models.

Open Weights Progress Agent and Tool Ecosystem ms-swift GRPO DPO Llama ModelScope Qwen3

Related guides (3)

GRPO · Concept

GRPO: The Lightweight RL Trick Behind Today's Reasoning Models

Open Weights Progress · Topic guide

Open Weights Progress: How Freely Available AI Models Caught Up to the Frontier

Agent and Tool Ecosystem · Topic guide

Agent and Tool Ecosystem: How AI Is Learning to Act, Not Just Answer

Related events (8)

4 · GitHub Trending · 1mo ago

Unsloth: Web UI and Library for Efficient Fine-tuning of Open Models

Unsloth is an open-source Python library and web UI (Unsloth Studio) for efficient fine-tuning and local inference of open-weight models including Gemma 4, Qwen3, DeepSeek, and GPT-OSS variants. The project has accumulated over 64,000 GitHub stars with continued daily growth (+139 today), indicating strong community adoption. It targets practitioners who want to train and run large models locally with reduced memory and compute requirements.

Open Weights Progress Inference Economics DeepSeek V4 Unsloth unslothai +3 more

3 · GitHub Trending · 8d ago

mlx-lm: LLM inference library for Apple MLX framework trending on GitHub

mlx-lm is an open-source Python library for running LLMs using Apple's MLX framework, designed for Apple Silicon hardware. The repository has accumulated 5,817 stars with 43 new stars today, indicating steady community interest. It represents a key piece of the Apple-native ML inference ecosystem.

Inference Economics mlx-lm Apple MLX

6 · Hugging Face Blog · 1mo ago

Parameter-Efficient Fine-Tuning using 🤗 PEFT

Hugging Face introduces the PEFT library, which enables parameter-efficient fine-tuning of large language models using techniques such as LoRA, prefix tuning, and prompt tuning. The library allows practitioners to adapt large pretrained models to downstream tasks while updating only a small fraction of model parameters, dramatically reducing compute and memory requirements. This lowers the barrier to fine-tuning frontier-scale models on consumer hardware.
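The "small fraction of model parameters" claim can be made concrete with a little arithmetic for LoRA, which replaces the update to a weight matrix of shape d_out × d_in with two low-rank factors of shapes d_out × r and r × d_in. The sketch below is plain Python and does not use the PEFT library; the layer size and rank are illustrative assumptions, not values from the post.

```python
def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in the two LoRA factors: B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


def full_params(d_out: int, d_in: int) -> int:
    """Parameters in the original dense weight matrix."""
    return d_out * d_in


# Assumed, illustrative sizes: one 4096 x 4096 projection adapted at rank 8.
d_out, d_in, rank = 4096, 4096, 8

lora = lora_trainable_params(d_out, d_in, rank)
full = full_params(d_out, d_in)
fraction = lora / full

print(f"LoRA: {lora:,} trainable vs {full:,} in the dense matrix ({fraction:.2%})")
# prints: LoRA: 65,536 trainable vs 16,777,216 in the dense matrix (0.39%)
```

Under these assumptions the adapter trains 1/256 of the parameters of the layer it wraps; the ratio for a whole model depends on which layers are adapted and at what rank.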

Open Weights Progress Inference Economics PEFT LoRA Hugging Face +4 more

7 · arXiv · cs.CL · 17d ago

PROVE framework trains LLMs for multi-step tool use via stateful MCP environments and programmatic rewards

Researchers introduce PROVE (Programmatic Rewards On Verified Environments), a framework for training LLMs to orchestrate multi-step tool calls using reinforcement learning. The system includes a library of 20 stateful MCP servers with 343 tools, an automated data synthesis pipeline that grounds training queries in live server state, and a multi-component programmatic reward function requiring no judge model. Training four models (Qwen3-4B, Qwen3-8B, Qwen2.5-7B, Granite-4.1-8B) with ~13K examples yields gains of up to +10.2 on BFCL Multi-Turn, +6.8 on tau2-bench, and +6.5 on T-Eval, demonstrating consistent improvements in multi-step tool orchestration.
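The summary does not list the components of PROVE's reward, so the following is only a hypothetical sketch of what a judge-free, multi-component programmatic reward for tool use could look like. The three components (tool-name match, exact call match, final environment state match), the weights, and every name in the code are assumptions made for illustration and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolCall:
    name: str
    args: tuple  # sorted (key, value) pairs so that calls compare by value


def call(name: str, **kwargs) -> ToolCall:
    """Build a ToolCall with arguments in a canonical order."""
    return ToolCall(name, tuple(sorted(kwargs.items())))


def programmatic_reward(predicted, reference, final_state, expected_state,
                        weights=(0.3, 0.3, 0.4)) -> float:
    """Hypothetical reward: weighted sum of three rule-based checks.

    Scores are normalised by the longer of the two call sequences, so both
    missing and extra calls lower the reward. No judge model is involved.
    """
    n = max(len(predicted), len(reference), 1)
    pairs = list(zip(predicted, reference))
    name_score = sum(p.name == r.name for p, r in pairs) / n
    call_score = sum(p == r for p, r in pairs) / n
    state_score = 1.0 if final_state == expected_state else 0.0
    w_name, w_call, w_state = weights
    return w_name * name_score + w_call * call_score + w_state * state_score


# Illustrative episode: the right tools in the right order, one wrong argument.
reference = [call("create_ticket", title="bug"),
             call("assign_ticket", ticket_id=1, user="ana")]
predicted = [call("create_ticket", title="bug"),
             call("assign_ticket", ticket_id=1, user="bob")]

reward = programmatic_reward(
    predicted, reference,
    final_state={"ticket_1_owner": "bob"},
    expected_state={"ticket_1_owner": "ana"},
)
print(round(reward, 2))
```

Because every component is computed from the call trace and the environment state, the same episode always receives the same score, which is the practical appeal of programmatic rewards over a judge model.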

Evaluation and Benchmarking Agent and Tool Ecosystem Qwen2.5-7B GRPO Qwen3-4B +7 more

7 · Qwen Research · 1mo ago

Qwen2.5-Math: Open-Source Mathematical LLM Series Released

Alibaba's Qwen team has released Qwen2.5-Math, an upgraded series of open-source mathematical LLMs including base and instruction-tuned models at 1.5B, 7B, and 72B parameter scales, plus a mathematical reward model. The models support Chain-of-Thought (CoT) and Tool-Integrated Reasoning (TIR) for English and Chinese math problem solving. The release follows Qwen2-Math by approximately one month, and the team claims it is the leading open-source mathematical LLM series.

Frontier Model Releases Evaluation and Benchmarking Tool-Integrated Reasoning Chain-of-Thought Reasoning Qwen2.5-Math-PRM +2 more

7 · arXiv · cs.CL · 18d ago

On the Scaling of PEFT: Towards Million Personal Models of Trillion Parameters

This paper reframes parameter-efficient fine-tuning (PEFT) not merely as a cheaper alternative to full fine-tuning, but as a substrate for persistent, instance-specific personal models layered atop shared foundation models. The authors analyze three scaling axes: Scale Up (stronger base models amplifying adapter utility), Scale Down (minimum viable adapter size), and Scale Out (managing millions of concurrent adapted instances). They introduce MinT as an infrastructure reference for adapter identity, versioning, provenance, evaluation, and serving at scale.

Training Infrastructure Inference Economics LoRA Parameter-Efficient Fine-Tuning MinT +2 more

3 · GitHub Trending · 1mo ago

vLLM: High-Throughput LLM Inference and Serving Engine Trending on GitHub

vLLM is an open-source Python library providing high-throughput and memory-efficient inference and serving for large language models. The project has accumulated over 80,500 GitHub stars with 98 new stars today, indicating continued strong community interest. It is a widely adopted inference backend in the AI/ML ecosystem, supporting PagedAttention and various optimization techniques for LLM deployment.

Inference Economics Agent and Tool Ecosystem vllm-project vLLM

7 · Qwen Research · 1mo ago

Qwen2.5-Max: Large-Scale MoE Model Release by Alibaba's Qwen Team

Alibaba's Qwen team announces Qwen2.5-Max, a large-scale Mixture-of-Experts language model. The post acknowledges that scaling insights for very large MoE models have been limited, citing DeepSeek V3's recent disclosures as a reference point. The model is positioned as a frontier-scale MoE system developed concurrently with ongoing Qwen2 research.

Training Infrastructure Frontier Model Releases DeepSeek V4 Alibaba Qwen Team +3 more