Qwen-AgentWorld: Language world models for general agent simulation and planning
Alibaba's Qwen team introduces Qwen-AgentWorld, a pair of language world models (35B-A3B and 397B-A17B) trained to simulate agentic environments across 7 domains using over 10M interaction trajectories. The models are trained via a three-stage pipeline (CPT, SFT, RL) and evaluated on AgentWorldBench, a new benchmark constructed from 5 frontier models across 9 established benchmarks. Beyond simulation, the work demonstrates two downstream use cases: using the world model as a decoupled RL training environment and as a warm-up for agent foundation models, both yielding gains over baselines.
Related guides (3)
Related events (8)
Qwen releases AgentWorld-35B-A3B: a world-model and environment-simulation MoE for agents
Qwen has released Qwen-AgentWorld-35B-A3B on Hugging Face, a 35B-parameter MoE model (3B active) built on the Qwen3.5 MoE architecture. The model is tagged for world-model and environment-simulation use cases, suggesting it is designed to simulate environments for agent training or evaluation. It is paired with a dataset called AgentWorldBench, indicating an associated evaluation suite. Early engagement is minimal (0 downloads, 4 likes) but the model represents a notable direction in agent-environment modeling from a major open-weights lab.
Qwen3.7-Max: The Agent Frontier
Alibaba's Qwen team has announced Qwen3.7-Max, positioned as a frontier model for agentic tasks. The announcement appears on the official Qwen blog and generated significant community discussion on Hacker News with 559 points and 217 comments. The model name suggests it is part of the Qwen 3 generation, with a focus on agent capabilities.
Qwen2 Model Family Released: Five Sizes, 128K Context, Multilingual
Alibaba's Qwen team has released Qwen2, an evolution from Qwen1.5, comprising five pretrained and instruction-tuned models ranging from 0.5B to 72B parameters, including a 57B mixture-of-experts variant (57B-A14B). The release highlights training on 27 additional languages beyond English and Chinese, significantly improved coding and mathematics performance, and extended context support up to 128K tokens for the 7B and 72B instruct variants. Benchmark results are claimed to be state-of-the-art across a large number of evaluations.
Qwen2.5-LLM: Alibaba releases open-weight language models from 0.5B to 72B
Alibaba's Qwen team releases the Qwen2.5 series of decoder-only dense language models, open-sourcing seven variants spanning 0.5B to 72B parameters. The release targets production use cases in the 10-30B range and mobile deployments at 3B scale. This represents a significant expansion of the open-weights frontier from a Tier 1 Chinese AI lab.
Generalizing an LLM from 8k to 1M Context using Qwen-Agent
Alibaba's Qwen team describes an agent built on Qwen2 (8k native context) that processes documents up to 1M tokens by decomposing retrieval and reasoning tasks, reportedly outperforming both RAG pipelines and native long-context models. The agent framework was also used to generate synthetic training data for fine-tuning new long-context Qwen models, creating a self-improvement loop. This positions agent-based context extension as a practical alternative to architectural long-context training.
Qwen-VLA: Unified Vision-Language-Action Model Across Robot Tasks, Environments, and Embodiments
Alibaba's Qwen team presents Qwen-VLA, a unified embodied foundation model that extends the Qwen vision-language stack to continuous action and trajectory generation via a DiT-based action decoder. The model is jointly pretrained on diverse data spanning manipulation trajectories, egocentric demonstrations, synthetic simulation, and navigation data, with embodiment-aware prompt conditioning to support multiple robot platforms. A unified action-and-trajectory prediction framework covers manipulation, navigation, and trajectory prediction tasks. Benchmarks show strong results: 97.9% on LIBERO, 73.7% on Simpler-WidowX, 69.0% OSR on R2R navigation, and 76.9% average OOD success in real-world ALOHA experiments.
Qwen2.5-Max: Large-Scale MoE Model Release by Alibaba's Qwen Team
Alibaba's Qwen team announces Qwen2.5-Max, a large-scale Mixture-of-Experts language model. The post acknowledges that scaling insights for very large MoE models have been limited, citing DeepSeek V3's recent disclosures as a reference point. The model is positioned as a frontier-scale MoE system developed concurrently with ongoing Qwen2 research.
Qwen2.5-VL-32B: Reinforcement-Learning-Optimized Vision-Language Model Released
Alibaba's Qwen team has released Qwen2.5-VL-32B-Instruct, a 32-billion-parameter vision-language model built on the Qwen2.5-VL series and further optimized with reinforcement learning. The model is open-sourced under the Apache 2.0 license and available on Hugging Face and ModelScope. It follows the January 2025 launch of the broader Qwen2.5-VL series, positioning the 32B scale as a balance between capability and deployment practicality.


