7Qwen Research (via RSSHub)·1mo ago

Qwen2.5-Omni: Alibaba Releases End-to-End Multimodal Model with Real-Time Streaming

Alibaba's Qwen team has released Qwen2.5-Omni, a 7B-parameter end-to-end multimodal model that can process text, image, audio, and video inputs together. It streams responses in real time as both text and synthesized natural speech. The model is openly available on Hugging Face, ModelScope, DashScope, and GitHub, and is accompanied by a technical paper.

Frontier Model Releases Open Weights Progress Inference Economics Multimodal Progress Alibaba Qwen2.5-Omni Qwen Hugging Face ModelScope DashScope

Related guides (4)

Hugging Face

Hugging Face: The Home of Open-Source AI

Frontier Model Releases·Topic guide

Frontier Model Releases: The Race From Language to Action

Open Weights Progress·Topic guide

Open Weights Progress: How Freely Available AI Models Caught Up to the Frontier

Multimodal Progress·Topic guide

Multimodal Progress: How AI Learned to See, Hear, and Act

Related events (8)

6Qwen Research·1mo ago·source ↗

Qwen2-Audio: Multimodal Audio-Language Model Release

Alibaba's Qwen team has released Qwen2-Audio, the successor to Qwen-Audio, which accepts both audio and text inputs and generates text outputs. The team positions the model as a step toward AGI by extending large language model capabilities to audio. It is released with an accompanying paper, a GitHub repository, and model weights on Hugging Face and ModelScope.

Frontier Model Releases Open Weights Progress Alibaba Qwen Hugging Face +3 more

6Qwen·15d ago·source ↗

Qwen releases Qwen3.5-2B multimodal model on Hugging Face

Alibaba's Qwen team released Qwen3.5-2B, a 2-billion-parameter image-text-to-text model, on Hugging Face. The model supports conversational use and is compatible with Azure deployment endpoints. With nearly 2 million downloads, it has seen substantial community uptake.

Open Weights Progress Multimodal Progress Qwen3.5-2B-Base Microsoft Azure Qwen +1 more

5Qwen·15d ago·source ↗

Qwen releases Qwen3.5-0.8B multimodal model on Hugging Face

Alibaba's Qwen team released Qwen3.5-0.8B, a small-scale image-text-to-text model, on Hugging Face. The model supports conversational use and is compatible with Azure deployment endpoints. With over 2.7 million downloads and 562 likes, it has seen substantial community uptake for a sub-1B parameter multimodal model.

Open Weights Progress Multimodal Progress Qwen3.5-0.8B Microsoft Azure Qwen +1 more

7Qwen·15d ago·source ↗

Qwen releases Qwen3.5-27B multimodal model on Hugging Face

Qwen has released Qwen3.5-27B, a 27-billion parameter image-text-to-text model, on Hugging Face. The model supports conversational use and is compatible with Azure deployment endpoints. With nearly 3 million downloads and 981 likes, it has seen substantial community uptake.

Frontier Model Releases Open Weights Progress Qwen3.6-27B Qwen Hugging Face +1 more

6Qwen·15d ago·source ↗

Qwen releases Qwen3.5-9B multimodal model on Hugging Face

Qwen has released Qwen3.5-9B, a 9-billion parameter image-text-to-text model, on Hugging Face. The model supports conversational use cases and is compatible with Azure deployment endpoints. With over 9 million downloads and 1,500+ likes, it has seen substantial community uptake.

Frontier Model Releases Open Weights Progress Microsoft Azure Qwen3-4B Qwen +2 more

6Qwen·15d ago·source ↗

Qwen releases Qwen3.5-4B multimodal model on Hugging Face

Qwen has released Qwen3.5-4B, a 4-billion parameter image-text-to-text model, on Hugging Face. The model supports conversational use and is compatible with Azure deployment endpoints. With over 10 million downloads and 604 likes, it has seen substantial community uptake.

Open Weights Progress Multimodal Progress Microsoft Azure Qwen3-4B Qwen +1 more

7Qwen·15d ago·source ↗

Qwen releases Qwen3.5-122B-A10B multimodal MoE model on Hugging Face

Qwen has released Qwen3.5-122B-A10B, a 122B-parameter mixture-of-experts image-text-to-text model with 10B active parameters, published on Hugging Face. The model supports conversational use and is compatible with Azure deployment endpoints. High download counts (840K) and likes (564) suggest rapid community uptake shortly after release.

Frontier Model Releases Open Weights Progress Microsoft Azure Qwen Qwen3.5-122B-A10B +2 more

7Qwen Research·1mo ago·source ↗

Qwen2-VL: Alibaba Releases Latest Vision-Language Model with Extended Video Understanding

Alibaba's Qwen team has released Qwen2-VL, the latest iteration of its vision-language model series, built on the Qwen2 foundation. The team claims state-of-the-art performance on visual understanding benchmarks including MathVista, DocVQA, RealWorldQA, and MTVQA. A notable capability is understanding videos longer than 20 minutes for question answering, dialog, and content creation tasks.

Frontier Model Releases Evaluation and Benchmarking Qwen2.5-VL RealWorldQA DocVQA +6 more