Almanac
Qwen

company · active · qwen-b28afe34 · 41 events · first seen 1mo ago

Aliases: Qwen

Recent events (41)

7 · Qwen Research · 1mo ago

GSPO: Group Sequence Policy Optimization for Scalable RL Training of Language Models

Qwen researchers introduce Group Sequence Policy Optimization (GSPO), a new RL algorithm designed to address severe training instability and model collapse observed in existing methods like GRPO during extended training runs. The core motivation is enabling stable RL scaling for language models to improve reasoning and problem-solving capabilities with increased compute. The paper targets a known bottleneck in post-training pipelines where instability prevents further performance gains.

8 · Qwen Research · 1mo ago

Qwen3-Coder: 480B MoE Agentic Coding Model Released by Alibaba/Qwen Team

Alibaba's Qwen team has released Qwen3-Coder, a family of code-focused models with the flagship variant being Qwen3-Coder-480B-A35B-Instruct, a 480B-parameter Mixture-of-Experts model with 35B active parameters. It supports 256K native context length and up to 1M tokens via extrapolation. The model claims state-of-the-art results among open-weight models on agentic coding, browser-use, and tool-use benchmarks, with performance described as comparable to Claude Sonnet 4.

7 · Qwen Research · 1mo ago

Qwen2.5-Omni: Alibaba Releases End-to-End Multimodal Model with Real-Time Streaming

Alibaba's Qwen team releases Qwen2.5-Omni, a 7B-parameter end-to-end multimodal model capable of processing text, images, audio, and video simultaneously. The model delivers real-time streaming responses in both text and natural speech synthesis. It is openly available on Hugging Face, ModelScope, DashScope, and GitHub, accompanied by a technical paper.

7 · Qwen Research · 1mo ago

QwQ-32B: Scaling Reinforcement Learning for Enhanced Reasoning

Alibaba's Qwen team releases QwQ-32B, a 32-billion parameter model trained with scaled Reinforcement Learning to improve reasoning capabilities beyond conventional pretraining and post-training methods. The release draws explicit comparison to DeepSeek R1's cold-start and multi-stage RL training approach. The model is available via Qwen Chat, Hugging Face, ModelScope, and a demo interface. This represents Qwen's exploration of RL scalability as a path to enhanced LLM intelligence.

6 · Qwen Research · 1mo ago

Global-batch Load Balancing for MoE LLM Training from Qwen

Qwen Research introduces a global-batch load balancing technique for Mixture-of-Experts (MoE) LLM training, claiming it is nearly a 'free lunch' improvement. The method addresses expert load imbalance across training batches, a known efficiency and quality bottleneck in MoE architectures. The approach targets the router and expert activation dynamics in transformer-based MoE layers.
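
The summary does not give the loss formula, so the following is only a minimal sketch of why the batch over which balance is measured matters. It assumes the common Switch-Transformer-style auxiliary loss (number of experts times the sum of routed fraction times mean router probability) and top-1 routing; the function name, the two toy micro-batches, and all numbers are hypothetical and not taken from the Qwen post.

```python
from collections import Counter

def load_balance_loss(assignments, router_probs, n_experts):
    """Assumed auxiliary loss: n_experts * sum_e f_e * p_e.

    assignments  -- expert index chosen for each token (top-1 routing assumed)
    router_probs -- per-token list of router probabilities over the experts
    """
    n_tokens = len(assignments)
    counts = Counter(assignments)
    loss = 0.0
    for e in range(n_experts):
        f_e = counts[e] / n_tokens                        # fraction of tokens routed to expert e
        p_e = sum(p[e] for p in router_probs) / n_tokens  # mean router probability for expert e
        loss += f_e * p_e
    return n_experts * loss

# Two hypothetical micro-batches, each drawn from a single domain,
# so each one on its own sends every token to one expert.
code_batch = ([0, 0, 0, 0], [[0.9, 0.1]] * 4)
math_batch = ([1, 1, 1, 1], [[0.1, 0.9]] * 4)

# Balance measured inside each micro-batch, then averaged.
micro_losses = [load_balance_loss(a, p, 2) for a, p in (code_batch, math_batch)]
micro_batch_loss = sum(micro_losses) / len(micro_losses)

# Balance measured once over the concatenated (global) batch.
global_batch_loss = load_balance_loss(
    code_batch[0] + math_batch[0], code_batch[1] + math_batch[1], 2
)

print(micro_batch_loss, global_batch_loss)
```

In this toy setup the per-micro-batch loss penalizes the router heavily, while the global-batch loss sits at the value a perfectly uniform router would get, because the two experts are evenly used once both micro-batches are counted together.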

7 · Qwen Research · 1mo ago

QwQ-32B-Preview: Alibaba's Qwen Reasoning Model with Deep Reflection Capabilities

Alibaba's Qwen team has released QwQ-32B-Preview, a 32-billion parameter model designed for deep reasoning across mathematics, code, and general knowledge. The model is positioned as a reasoning-focused system that emphasizes uncertainty and iterative questioning as core design principles. It is available on GitHub, Hugging Face, ModelScope, and via a demo interface.

4 · Qwen Research · 1mo ago

Introducing the Qwen Series: Overview of Alibaba's Open-Source LLM Journey

Alibaba's Qwen team published a retrospective introduction to the Qwen series of large language models, four months after the initial Qwen-7B open-source release. The post consolidates links to their paper, GitHub, Hugging Face, and ModelScope repositories, and outlines the team's objectives for the open-source LLM program. It serves as a canonical reference point for the Qwen model family's public positioning.

7 · Hacker News · 27d ago

Qwen3.7-Max: The Agent Frontier

Alibaba's Qwen team has announced Qwen3.7-Max, positioned as a frontier model for agentic tasks. The announcement appears on the official Qwen blog and generated significant community discussion on Hacker News with 559 points and 217 comments. The model name suggests it is part of the Qwen 3 generation, with a focus on agent capabilities.

5 · Qwen · 11d ago

Qwen releases Qwen-Image-Bench, a multimodal judge/evaluation model

Qwen has released Qwen-Image-Bench on Hugging Face, an image-text-to-text model tagged as a judge-model for evaluation and benchmarking purposes. The model supports both English and Chinese and appears designed to evaluate text-to-image outputs. With 8,572 downloads and 50 likes shortly after release, it has attracted modest early interest.

6 · Qwen · 11d ago

Qwen releases Qwen3.6-27B multimodal model on Hugging Face

Qwen published Qwen3.6-27B, a 27-billion-parameter image-text-to-text model, on Hugging Face. The model supports conversational use and is compatible with Azure deployment endpoints. With over 5.4 million downloads and 1,619 likes, it has seen substantial community uptake.

6 · Qwen · 11d ago

Qwen releases Qwen3.6-35B-A3B multimodal MoE model on Hugging Face

Qwen published Qwen3.6-35B-A3B, a 35B-parameter mixture-of-experts image-text-to-text model with 3B active parameters, on Hugging Face. The model supports conversational use and is compatible with Azure deployment endpoints. With over 5.9 million downloads and 2,000 likes, it has seen substantial community uptake.

5 · Qwen · 11d ago

Qwen releases Qwen3.5-0.8B-Base multimodal model on Hugging Face

Qwen has released Qwen3.5-0.8B-Base, a small 0.8B parameter image-text-to-text base model on Hugging Face. The model supports conversational use and is compatible with Hugging Face endpoints. With nearly 200K downloads, it signals meaningful community uptake for a compact multimodal base model.

6 · Qwen · 11d ago

Qwen releases Qwen3.5-2B-Base multimodal model on Hugging Face

Qwen released Qwen3.5-2B-Base, a 2-billion parameter base model supporting image-text-to-text tasks, on Hugging Face. The model is tagged as conversational and endpoints-compatible, suggesting deployment readiness. With nearly 180K downloads, it has seen significant early adoption in the open-weights community.

6 · Qwen · 11d ago

Qwen releases Qwen3.5-4B-Base multimodal model on Hugging Face

Qwen has released Qwen3.5-4B-Base, a 4-billion parameter base model supporting image-text-to-text tasks, published on Hugging Face. The model is tagged as conversational and endpoints-compatible, using the safetensors format. With over 207,000 downloads, it represents a new entry in the Qwen3.5 model family with multimodal capabilities at a small parameter count.

6 · Qwen · 11d ago

Qwen releases Qwen3.5-9B-Base multimodal model on Hugging Face

Qwen has released Qwen3.5-9B-Base, a 9-billion-parameter image-text-to-text base model on Hugging Face. The model supports conversational use and is compatible with the transformers library and inference endpoints. With over 153,000 downloads, it has seen substantial early adoption.

7 · Qwen · 11d ago

Qwen releases Qwen3.5-122B-A10B multimodal MoE model on Hugging Face

Qwen has released Qwen3.5-122B-A10B, a 122B-parameter mixture-of-experts image-text-to-text model with 10B active parameters, published on Hugging Face. The model supports conversational use and is compatible with Azure deployment endpoints. High download counts (840K) and likes (564) suggest rapid community uptake shortly after release.

6 · Qwen · 11d ago

Qwen releases Qwen3.5-35B-A3B-Base multimodal MoE model on Hugging Face

Qwen has released Qwen3.5-35B-A3B-Base, a 35B-parameter mixture-of-experts image-text-to-text base model on Hugging Face, activating approximately 3B parameters per forward pass. The model supports conversational use and is compatible with Azure deployment endpoints. With over 109K downloads, it represents a notable open-weights multimodal MoE release from the Qwen team.

7 · Qwen · 11d ago

Qwen releases Qwen3.5-27B multimodal model on Hugging Face

Qwen has released Qwen3.5-27B, a 27-billion parameter image-text-to-text model, on Hugging Face. The model supports conversational use and is compatible with Azure deployment endpoints. With nearly 3 million downloads and 981 likes, it has seen substantial community uptake.

7 · Qwen · 11d ago

Qwen releases Qwen3.5-35B-A3B multimodal MoE model on Hugging Face

Qwen has released Qwen3.5-35B-A3B, a 35B-parameter mixture-of-experts image-text-to-text model with approximately 3B active parameters, published on Hugging Face. The model supports conversational use and is compatible with Azure deployment endpoints. With over 2.8 million downloads and 1,400+ likes, it has seen substantial community uptake.

5 · Qwen · 11d ago

Qwen releases Qwen3.5-0.8B multimodal model on Hugging Face

Alibaba's Qwen team released Qwen3.5-0.8B, a small-scale image-text-to-text model, on Hugging Face. The model supports conversational use and is compatible with Azure deployment endpoints. With over 2.7 million downloads and 562 likes, it has seen substantial community uptake for a sub-1B parameter multimodal model.

6 · Qwen · 11d ago

Qwen releases Qwen3.5-2B multimodal model on Hugging Face

Alibaba's Qwen team released Qwen3.5-2B, a 2-billion-parameter image-text-to-text model, on Hugging Face. The model supports conversational use and is compatible with Azure deployment endpoints. With nearly 2 million downloads, it has seen substantial community uptake.

6 · Qwen · 11d ago

Qwen releases Qwen3.5-4B multimodal model on Hugging Face

Qwen has released Qwen3.5-4B, a 4-billion parameter image-text-to-text model, on Hugging Face. The model supports conversational use and is compatible with Azure deployment endpoints. With over 10 million downloads and 604 likes, it has seen substantial community uptake.

6 · Qwen · 11d ago

Qwen releases Qwen3.5-9B multimodal model on Hugging Face

Qwen has released Qwen3.5-9B, a 9-billion parameter image-text-to-text model, on Hugging Face. The model supports conversational use cases and is compatible with Azure deployment endpoints. With over 9 million downloads and 1,500+ likes, it has seen substantial community uptake.

7 · Qwen Research · 1mo ago

Qwen2.5-1M: Open-Source Models with 1M Token Context Window Released

Alibaba's Qwen team has released two open-source models, Qwen2.5-7B-Instruct-1M and Qwen2.5-14B-Instruct-1M, extending context length to 1 million tokens. This follows the earlier upgrade of the proprietary Qwen2.5-Turbo to 1M context two months prior. The release includes inference framework support for deployment, marking the first time Qwen's open-weight models have reached this context length.

6 · Qwen Research · 1mo ago

Qwen2.5-Math Process Reward Model for Mathematical Reasoning Supervision

Alibaba's Qwen team introduces a process reward model (PRM) aimed at improving the reliability of mathematical reasoning in LLMs by supervising intermediate reasoning steps rather than only final answers. The work addresses the problem of models producing plausible but flawed intermediate derivations even when reaching correct conclusions. The release includes model weights on Hugging Face and ModelScope alongside a GitHub repository.
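
To illustrate the difference between outcome-only and step-level supervision described above, here is a small sketch. No real model is called: the per-step scores are made-up numbers standing in for what a PRM might output, and aggregating them with `min` is one common choice assumed here, not something the summary attributes to the Qwen release.

```python
def outcome_score(solution):
    """Outcome-only check: 1.0 if the final answer matches the reference."""
    return 1.0 if solution["final_answer"] == solution["reference"] else 0.0

def process_score(solution):
    """Step-level score: a chain is treated as only as reliable as its
    weakest step (min aggregation, an assumption of this sketch)."""
    return min(solution["step_scores"])

# Two hypothetical candidate solutions that reach the same correct answer.
# Candidate A contains one step the (imaginary) PRM rates as likely flawed.
candidates = [
    {"name": "A", "final_answer": "42", "reference": "42",
     "step_scores": [0.95, 0.20, 0.90]},
    {"name": "B", "final_answer": "42", "reference": "42",
     "step_scores": [0.90, 0.85, 0.88]},
]

outcome_scores = [outcome_score(c) for c in candidates]
best_by_process = max(candidates, key=process_score)["name"]

print(outcome_scores)    # both candidates look identical to an outcome check
print(best_by_process)   # the step-level score separates them
```

An outcome check cannot tell the two candidates apart, while the step-level score prefers the one without the suspect intermediate step.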

7 · Qwen Research · 1mo ago

QVQ-72B-Preview: Qwen Visual Reasoning Model Release

Alibaba's Qwen team has released QVQ-72B-Preview, a 72-billion parameter multimodal model designed to integrate visual understanding with advanced reasoning capabilities. The model is positioned as an extension of Qwen's language reasoning work into the visual domain. It is available on GitHub, Hugging Face, ModelScope, and Kaggle with a live demo.

5 · Qwen Research · 1mo ago

CodeQwen1.5: Alibaba's Open-Source Code LLM Release

Alibaba's Qwen team released CodeQwen1.5, an open-source large language model specialized for code generation and programming assistance. The release is positioned as a transparent, accessible alternative to proprietary coding assistants like GitHub Copilot, addressing concerns around cost, privacy, security, and copyright. The model is available on GitHub, Hugging Face, and ModelScope.

6 · Qwen Research · 1mo ago

Introducing Qwen-VL-Plus and Qwen-VL-Max: Upgraded Multimodal Models from Alibaba

Alibaba's Qwen team has launched two enhanced versions of their multimodal model, Qwen-VL-Plus and Qwen-VL-Max, building on the open-sourced Qwen-VL released in September 2023. Key improvements include substantially boosted image reasoning capabilities, enhanced detail recognition and text extraction from images, and support for high-definition images exceeding one million pixels across various aspect ratios. The upgrades represent a significant step forward in the Qwen-VL series' generalization and visual understanding capabilities.

4 · Qwen Research · 1mo ago

OFA: Towards Building a One-For-All Unified Multimodal Pretrained Model

Alibaba's Qwen team introduces OFA (One-For-All), a unified multimodal pretrained model designed to handle both understanding and generation tasks across multiple modalities within a single framework. The model is pretrained using instruction-based multitask pretraining to endow it with diverse capabilities. This work was published in late 2022 as part of the broader wave of generalist multimodal models. It represents an early effort toward a single model architecture capable of spanning vision, language, and cross-modal tasks.

6 · arXiv · cs.LG · 14d ago

Skill-RM: A unified reward model framework treating evaluation as an agentic skill

Researchers from the Qwen team propose Skill-RM, a framework that reformulates reward modeling as the execution of a reusable 'Reward-Evaluation Skill,' enabling a single model to orchestrate heterogeneous evaluation criteria including rule-based verifiers, ground-truth references, and rubrics. By treating reward computation as a structured agentic task, Skill-RM dynamically selects and aggregates evidence per input rather than relying on static evaluation. Experiments on reward benchmarks and downstream tasks (best-of-N selection, RL) show consistent improvements over traditional judge baselines. The code is publicly released under the Qwen-Applications GitHub organization.

6 · Qwen Research · 1mo ago

Qwen2-Audio: Multimodal Audio-Language Model Release

Alibaba's Qwen team releases Qwen2-Audio, the successor to Qwen-Audio, capable of accepting both audio and text inputs and generating text outputs. The model is positioned as a step toward AGI by extending large language model capabilities to audio modalities. It is released with accompanying paper, GitHub repository, and model weights on Hugging Face and ModelScope.

4 · Qwen Research · 1mo ago

OFASys: Multitask Multimodal Learning Framework from Alibaba/Qwen

Alibaba's Qwen team released OFASys, an open-source framework designed to simplify multimodal multitask learning, building on their earlier OFA unified pretrained model. The system aims to reduce engineering friction in setting up multi-task, multi-modal training pipelines, including data batching and training stability. It is positioned as infrastructure for building generalist AI models with minimal code overhead.

6 · arXiv · cs.CL · 26d ago

DelTA: Discriminative Token Credit Assignment for RLVR Training

DelTA introduces a discriminative token credit assignment method for reinforcement learning from verifiable rewards (RLVR) that addresses the problem of high-frequency formatting tokens dominating policy gradient updates. The method estimates per-token coefficients to amplify side-specific gradient directions and downweight shared or weakly discriminative ones, making the effective update direction more contrastive. On seven mathematical benchmarks, DelTA outperforms same-scale baselines by 3.26 and 2.62 average points on Qwen3-8B-Base and Qwen3-14B-Base respectively, with additional gains on code generation tasks.

5 · arXiv · cs.CL · 2d ago

RePro: Retrospective Progress-Aware Self-Refinement for LLM Agent Training

Researchers introduce RePro (Retrospective Progress-Aware Training), a framework addressing the gap between step-wise RL optimization and metacognitive task-progress awareness in LLM agents. The approach uses a forward-then-reflect rollout paradigm where agents execute actions online and then retrospectively assess step-wise progress given the completed trajectory and known outcome. Evaluated on WebShop, ALFWorld, and Sokoban, RePro achieves up to 12% absolute success rate gains over baseline Qwen-family models without requiring continuous external supervision.

5 · arXiv · cs.AI · 6d ago

Reroute: Training-free recoverable visual token routing for vision-language models

A new arXiv preprint proposes Reroute, a training-free plug-in that replaces the standard rank-and-remove visual token pruning paradigm in VLMs with a recoverable routing mechanism. Instead of permanently discarding low-ranked tokens, Reroute defers them to re-enter the candidate pool at later decoder stages, addressing the problem that token importance shifts across decoder depth. Evaluated on LLaVA-1.5 and Qwen backbones augmented with FastV, PDrop, and Nüwa pruning methods, Reroute improves grounding performance under aggressive token reduction without sacrificing general VQA accuracy. The approach preserves the theoretical compute and KV-cache budget of the underlying pruning method.
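
The contrast between rank-and-remove pruning and recoverable routing can be sketched as a toy example. This is not the paper's implementation: the function names, the per-stage importance scores, and the rule that every deferred token rejoins the pool at every later stage are all assumptions made for illustration. Both variants keep the same number of tokens at each stage.

```python
def top_k(candidates, scores, k):
    """Return the k candidate token indices with the highest scores, in index order."""
    ranked = sorted(candidates, key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])

def rank_and_remove(stage_scores, budgets):
    """Baseline: a token dropped at one stage is gone for all later stages."""
    active = list(range(len(stage_scores[0])))
    for scores, k in zip(stage_scores, budgets):
        active = top_k(active, scores, k)
    return active

def recoverable_routing(stage_scores, budgets):
    """Toy recoverable variant: deferred tokens rejoin the candidate pool."""
    active, deferred = list(range(len(stage_scores[0]))), []
    for scores, k in zip(stage_scores, budgets):
        pool = active + deferred                  # deferred tokens get another chance
        active = top_k(pool, scores, k)
        deferred = [i for i in pool if i not in active]
    return active

# Hypothetical importance scores for 4 visual tokens at two decoder stages.
# Token 3 looks unimportant early but becomes the most important later.
stage_scores = [
    [0.90, 0.80, 0.70, 0.10],
    [0.20, 0.30, 0.40, 0.95],
]
budgets = [3, 2]  # tokens kept at each stage (same for both methods)

removed = rank_and_remove(stage_scores, budgets)
rerouted = recoverable_routing(stage_scores, budgets)
print(removed, rerouted)
```

With these made-up scores, the baseline ends with tokens 1 and 2 because token 3 was discarded at the first stage, whereas the recoverable variant ends with tokens 2 and 3 under the same budget.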

8 · Mistral AI News · 1mo ago

Mistral Small 4: Unified Multimodal, Reasoning, and Coding MoE Model Released Under Apache 2.0

Mistral AI has released Mistral Small 4, a 119B-parameter Mixture-of-Experts model (6B active per token) that unifies capabilities previously split across Magistral (reasoning), Pixtral (multimodal), and Devstral (coding agents) into a single open-weights model. The model features a 256k context window, configurable reasoning effort via a `reasoning_effort` parameter, native text and image input support, and is released under Apache 2.0. Mistral claims 40% latency reduction and 3x throughput improvement over Mistral Small 3, with benchmark results showing competitive performance against GPT-OSS 120B and Qwen models while producing significantly shorter outputs. The release includes day-0 availability as an NVIDIA NIM and support across vLLM, llama.cpp, SGLang, and Transformers.

6 · arXiv · cs.CL · 21d ago

SAERL: Using Sparse Autoencoders to Guide LLM Reinforcement Learning Data Engineering

SAERL is a post-training data engineering framework that uses Sparse Autoencoders (SAEs) — a mechanistic interpretability tool — to extract intrinsic model signals for controlling data diversity, difficulty, and quality during RL fine-tuning. The framework applies SAE-space clustering for batch diversity, a difficulty proxy for curriculum ordering, and a quality probe for data filtering. On Qwen2.5-Math-1.5B with GRPO, SAERL achieves 3% average accuracy improvement and reaches target accuracy with 20% fewer training steps. SAE representations transfer across model families and scales, suggesting broad applicability as a lightweight data engineering tool.
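
The three controls named above (quality filtering, difficulty-based curriculum ordering, cluster-based batch diversity) could be combined roughly as follows. This is a sketch under heavy assumptions: the `cluster`, `difficulty`, and `quality` fields are hypothetical precomputed values standing in for SAE-derived signals, and the threshold and round-robin interleaving are illustrative choices, not the paper's procedure.

```python
from collections import defaultdict
from itertools import zip_longest

def build_batches(examples, batch_size, min_quality):
    """Filter by quality, order easy-to-hard, and mix clusters within batches."""
    # 1) Quality filter: drop examples below the (assumed) probe threshold.
    kept = [ex for ex in examples if ex["quality"] >= min_quality]

    # 2) Curriculum: sort easy to hard, then group by cluster so each
    #    cluster's list is itself ordered by difficulty.
    by_cluster = defaultdict(list)
    for ex in sorted(kept, key=lambda ex: ex["difficulty"]):
        by_cluster[ex["cluster"]].append(ex)

    # 3) Diversity: interleave clusters round-robin so that consecutive
    #    examples (and therefore each batch) come from different clusters.
    interleaved = [
        ex
        for group in zip_longest(*by_cluster.values())
        for ex in group
        if ex is not None
    ]
    return [interleaved[i:i + batch_size] for i in range(0, len(interleaved), batch_size)]

# Hypothetical training pool; all values are made up.
examples = [
    {"id": "a", "cluster": 0, "difficulty": 0.1, "quality": 0.90},
    {"id": "b", "cluster": 0, "difficulty": 0.7, "quality": 0.80},
    {"id": "c", "cluster": 1, "difficulty": 0.2, "quality": 0.90},
    {"id": "d", "cluster": 1, "difficulty": 0.8, "quality": 0.95},
    {"id": "e", "cluster": 0, "difficulty": 0.3, "quality": 0.10},  # filtered out
]

batches = build_batches(examples, batch_size=2, min_quality=0.5)
batch_ids = [[ex["id"] for ex in batch] for batch in batches]
print(batch_ids)
```

In this toy pool the low-quality example is dropped, the first batch holds the easier items, and each batch draws from both clusters.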

5 · arXiv · cs.CL · 13d ago

Knowledge editing via locate-then-edit transferred to masked diffusion language models, revealing multi-token failure mode

A new arXiv paper investigates whether locate-then-edit knowledge editing methods, developed for autoregressive models, transfer to masked diffusion language models (MDMs) such as LLaDA and Dream. The authors find that causal tracing identifies the same early-to-mid-layer MLP location in both paradigms, but MDMs degrade systematically on multi-token edits due to partially unmasked intermediate states that the edit was never optimized for. A correction targeting these intermediate states substantially restores multi-token editing performance. The work is the first systematic comparison of knowledge editing across autoregressive and diffusion-based language model paradigms.

6 · arXiv · cs.CL · 1h ago

RubricsTree: Scalable hierarchical rubric framework for evaluating personal health AI agents

RubricsTree is a new evaluation framework for LLM-powered personal health agents, built around a hierarchical taxonomy of over 100 clinically-verifiable Boolean rubrics derived from 4,000 real user queries and curated with physician oversight. A context-aware router activates only relevant rubrics per query, enabling scalable yet expert-aligned evaluation. The framework outperforms strong LLM-as-a-judge baselines on expert alignment and, when used as training signal, yields up to ~66% relative gains on HealthBench across Gemini, GPT, and Qwen model families. The work addresses a concrete bottleneck in clinical deployment of health AI: the cost-quality tradeoff in evaluation.
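
A router that activates only the relevant branches of a Boolean rubric tree might look like the sketch below. Everything in it is hypothetical: the two branches, the rubric names, the keyword-based router, and the string checks are simple stand-ins, whereas the paper describes physician-curated rubrics and a context-aware router whose details the summary does not give.

```python
# Hypothetical two-branch rubric tree; each leaf is a Boolean check on a response.
RUBRIC_TREE = {
    "medication": {
        "mentions_dosage_limits": lambda r: "dose" in r.lower(),
        "advises_consulting_clinician": lambda r: "doctor" in r.lower() or "clinician" in r.lower(),
    },
    "sleep": {
        "gives_sleep_duration_guidance": lambda r: "hours" in r.lower(),
    },
}

# Stand-in router: keyword matching on the query (an assumption of this sketch).
ROUTER_KEYWORDS = {
    "medication": ["ibuprofen", "dose", "pill"],
    "sleep": ["sleep", "insomnia"],
}

def route(query):
    """Return the rubric branches considered relevant to this query."""
    q = query.lower()
    return [branch for branch, kws in ROUTER_KEYWORDS.items() if any(k in q for k in kws)]

def evaluate(query, response):
    """Run only the activated rubrics; score is the fraction that pass."""
    results = {}
    for branch in route(query):
        for name, check in RUBRIC_TREE[branch].items():
            results[name] = check(response)
    score = sum(results.values()) / len(results) if results else None
    return results, score

results, score = evaluate(
    "How much ibuprofen can I take in a day?",
    "Do not exceed the labeled dose.",
)
print(results, score)
```

Only the medication branch fires for this query, so the sleep rubric is never evaluated; the response passes one of the two active rubrics.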

4 · GitHub Trending · 2d ago

Open Interpreter: lightweight coding agent for open models (DeepSeek, Kimi, Qwen)

Open Interpreter is an open-source Python coding agent framework supporting open-weight models including DeepSeek, Kimi, and Qwen. The project has accumulated nearly 64,000 GitHub stars, with 45 new stars on the trending day. It provides a lightweight harness for running code-executing agents on locally hosted or open models.

5 · arXiv · cs.AI · 13d ago

GeM-NR: Training-free multi-view editing for nonrigid 3D scene changes

GeM-NR is a training-free method for multi-view consistent image editing that handles nonrigid edits — changes that substantially alter scene geometry and appearance — a capability that existing methods largely lack. Given an anchor image edited by a backbone model (FLUX, Qwen, or BrushNet) and an unedited query image, the method propagates the edit consistently across viewpoints via depth estimation, point-cloud alignment, projection, and conditioned refinement. The authors report state-of-the-art performance on edit quality and geometric/photometric consistency across multiple views, including generation of 3D representations of edited scenes.