7Qwen Research (via RSSHub)·1mo ago

QVQ-Max: Alibaba Qwen Releases Visual Reasoning Model with Multimodal Chain-of-Thought

Alibaba's Qwen team has officially released QVQ-Max, a visual reasoning model that succeeds the December 2024 QVQ-72B-Preview. The model is designed to analyze and reason over images and videos, covering domains including mathematics, programming, and creative tasks. It moves beyond the exploratory preview and is positioned as a production-grade multimodal reasoning system.

Related events (8)

7Qwen Research·1mo ago

QVQ-72B-Preview: Qwen Visual Reasoning Model Release

Alibaba's Qwen team has released QVQ-72B-Preview, a 72-billion parameter multimodal model designed to integrate visual understanding with advanced reasoning capabilities. The model is positioned as an extension of Qwen's language reasoning work into the visual domain. It is available on GitHub, Hugging Face, ModelScope, and Kaggle with a live demo.

7Qwen Research·1mo ago

QwQ-32B-Preview: Alibaba's Qwen Reasoning Model with Deep Reflection Capabilities

Alibaba's Qwen team has released QwQ-32B-Preview, a 32-billion parameter model designed for deep reasoning across mathematics, code, and general knowledge. The model is positioned as a reasoning-focused system that emphasizes uncertainty and iterative questioning as core design principles. It is available on GitHub, Hugging Face, ModelScope, and via a demo interface.

7Qwen Research·1mo ago

Qwen2-VL: Alibaba Releases Latest Vision-Language Model with Extended Video Understanding

Alibaba's Qwen team has released Qwen2-VL, the latest iteration of its vision-language model series built on the Qwen2 foundation. The team claims state-of-the-art performance on visual understanding benchmarks including MathVista, DocVQA, RealWorldQA, and MTVQA. A notable capability is understanding videos longer than 20 minutes for question answering, dialog, and content creation tasks.

6Qwen Research·1mo ago

QwQ-Max-Preview Released by Qwen Team

Alibaba's Qwen team has released QwQ-Max-Preview, a preview version of its reasoning-focused model built on top of Qwen2.5-Max. The announcement post was itself generated by the model, serving as a demonstration of its capabilities. As a preview release, it signals an upcoming full model launch in the Qwen series.

6Qwen Research·1mo ago

Introducing Qwen-VL-Plus and Qwen-VL-Max: Upgraded Multimodal Models from Alibaba

Alibaba's Qwen team has launched two enhanced versions of their multimodal model, Qwen-VL-Plus and Qwen-VL-Max, building on the open-sourced Qwen-VL released in September 2023. Key improvements include substantially boosted image reasoning capabilities, enhanced detail recognition and text extraction from images, and support for high-definition images exceeding one million pixels across various aspect ratios. The upgrades represent a significant step forward in the Qwen-VL series' generalization and visual understanding capabilities.

7Qwen Research·1mo ago

Qwen VLo: Unified Multimodal Understanding and Generation Model

Alibaba's Qwen team has announced Qwen VLo, a new model that unifies multimodal understanding and image generation in a single architecture. Building on the Qwen2.5 VL lineage, the model is positioned to both comprehend and generate high-quality visual content. This represents a step toward unified perception-and-creation models, a direction several frontier labs are pursuing simultaneously.

7Qwen Research·1mo ago

QwQ-32B: Scaling Reinforcement Learning for Enhanced Reasoning

Alibaba's Qwen team has released QwQ-32B, a 32-billion parameter model trained with scaled reinforcement learning (RL) to improve reasoning capabilities beyond conventional pretraining and post-training methods. The release draws an explicit comparison to DeepSeek R1's cold-start and multi-stage RL training approach. The model is available via Qwen Chat, Hugging Face, ModelScope, and a demo interface. The release reflects Qwen's exploration of RL scalability as a path to enhanced LLM intelligence.

8Qwen Research·1mo ago

Qwen2.5-VL: Alibaba's New Flagship Vision-Language Model Released in 3B/7B/72B Sizes

Alibaba's Qwen team has released Qwen2.5-VL, its new flagship vision-language model and a significant upgrade over Qwen2-VL. The release includes both base and instruct variants in three sizes (3B, 7B, 72B), all open-weight and available on Hugging Face and ModelScope. The 72B instruct model is also accessible via Qwen Chat. The announcement highlights enhanced visual understanding and positions the model as a major step forward in multimodal performance.