Qwen-Max-0428: Alibaba's Largest Instruction-Tuned Model Released
Alibaba's Qwen team has released Qwen-Max-0428, a new instruction-tuned model larger than the previously open-sourced Qwen1.5-110B-Chat. It has entered Chatbot Arena and reached the top 10 on the leaderboard, and it also outperforms Qwen1.5-110B-Chat on MT-Bench. The model is available via API but does not appear to be open-weights at this stage.
Related events (8)
Qwen1.5-110B: Alibaba Releases First 100B+ Model in Qwen1.5 Series
Alibaba's Qwen team released Qwen1.5-110B, its first open-weights model exceeding 100 billion parameters. The model is claimed to perform comparably to Meta's Llama-3-70B on base model benchmarks, with strong results on the MT-Bench and AlpacaEval 2 chat evaluations. The release follows a wave of open-source models exceeding 100B parameters from various organizations.
Alibaba's Qwen3.7-Max positions as top Chinese LLM with closed weights and agentic focus
Alibaba released Qwen3.7-Max, a closed-weights proprietary model targeting long-running agentic tasks like coding and scientific discovery, with a 1M-token context window and 208 tokens/second output speed. The model ranks fifth to seventh on the Artificial Analysis Intelligence Index, trailing leading U.S. models from OpenAI, Anthropic, and Google but claiming the lowest hallucination rate among frontier models tested—partly by declining to answer over half of prompts. Alibaba's training approach separates task, agentic harness, and verifier components to prevent overfitting to specific setups. The release continues Alibaba's strategic shift from open to closed weights for top-tier models, with leadership changes in the Qwen team suggesting a revenue-focused pivot.
Alibaba releases Qwen3.5 open-weights vision-language model family with MoE architecture across eight sizes
Alibaba released the Qwen3.5 family of eight open-weights vision-language models ranging from 0.8B to 397B parameters, built on a mixture-of-experts architecture with mixed attention and Gated DeltaNet layers. The flagship Qwen3.5-397B-A17B outperforms GPT-5.2, Claude 4.5 Opus, and Gemini-3 Pro on 28 of 44 vision benchmarks, while the 9B model surpasses OpenAI's gpt-oss-120B on most language tasks. Open weights are available under Apache 2.0, with hosted agentic variants (Qwen3.5-Plus, Qwen3.5-Flash) available via Alibaba Cloud. The release is notable for strong small-model efficiency and comes amid reported team departures following the Qwen3 rollout.
Qwen1.5-32B: Alibaba's 30B-Parameter Capstone for the Qwen1.5 Series
Alibaba's Qwen team released Qwen1.5-32B, a ~30 billion parameter open-weights language model positioned as the capstone of the Qwen1.5 series. The model targets the emerging consensus around 30B parameters as an optimal balance between performance, memory footprint, and inference efficiency. It is released alongside code on GitHub, weights on HuggingFace and ModelScope, and an interactive demo.
Qwen2 Model Family Released: Five Sizes, 128K Context, Multilingual
Alibaba's Qwen team has released Qwen2, an evolution from Qwen1.5, comprising five pretrained and instruction-tuned models ranging from 0.5B to 72B parameters, including a 57B mixture-of-experts variant (57B-A14B). The release highlights training on 27 additional languages beyond English and Chinese, significantly improved coding and mathematics performance, and extended context support up to 128K tokens for the 7B and 72B instruct variants. Benchmark results are claimed to be state-of-the-art across a large number of evaluations.
Qwen2.5-Max: Large-Scale MoE Model Release by Alibaba's Qwen Team
Alibaba's Qwen team announced Qwen2.5-Max, a large-scale Mixture-of-Experts language model. The post acknowledges that scaling insights for very large MoE models have been limited, citing DeepSeek V3's recent disclosures as a reference point. The model is positioned as a frontier-scale MoE system developed concurrently with ongoing Qwen2 research.
Introducing Qwen1.5: Open-Source Models Across Eight Sizes Including MoE
Alibaba's Qwen team released Qwen1.5, open-sourcing both base and chat models in eight sizes ranging from 0.5B to 110B parameters, plus a Mixture-of-Experts (MoE) variant. The release emphasizes developer experience improvements alongside model quality. Models are available on GitHub, Hugging Face, and ModelScope.
Qwen3 Release: Flagship 235B MoE and Full Model Family Announced
Alibaba's Qwen team has released Qwen3, a new family of large language models including the flagship Qwen3-235B-A22B mixture-of-experts model. The flagship is claimed to be competitive with DeepSeek-R1, OpenAI o1/o3-mini, Grok-3, and Gemini-2.5-Pro on coding, math, and general-capability benchmarks. A smaller MoE variant, Qwen3-30B-A3B, reportedly outperforms QwQ-32B despite using only one-tenth the activated parameters, and the 4B model is said to match Qwen2.5's larger models. Models are available on Hugging Face, ModelScope, and Kaggle.


