7 · The Batch (DeepLearning.AI) · 3d ago

Z.ai releases GLM-5.2, a 753B MoE open-weights model claiming top open-model ranking on agentic coding benchmarks

Z.ai released GLM-5.2, a 753-billion-parameter mixture-of-experts open-weights model optimized for long-running agentic coding tasks, with a 1-million-token input context and an MIT license. The model ranks first among open-weights models on Artificial Analysis's Intelligence Index v4.1 (score 51, behind Claude Opus 4.8 at 56 and GPT-5.5 at 55) and leads all models on PostTrainBench, a benchmark for agentic fine-tuning tasks. Key technical contributions include a modified sparse attention indexer applied every four layers (cutting per-token computation by a factor of 2.9 at 1M context), a switch from GRPO to PPO for long-horizon RL training, and a reward-hacking mitigation pipeline that combines rule-based filters with a judge model. API pricing is substantially below that of comparable proprietary models, and the release coincides with U.S. government restrictions on access to Anthropic's frontier models.

Open Weights Progress · Inference Economics · Agent and Tool Ecosystem · Artificial Analysis Intelligence Index · AA-Briefcase · DeepSeek V4 · IndexCache · Claude Fable 5 · Proximal Policy Optimization · Hugging Face · Arena.ai · Code Arena WebDev · GRPO (Group Relative Policy Optimization) · OpenAI · GLM-5.1 · Z.ai · PostTrainBench · GPT-5.5 · Claude Opus 4.8 · Anthropic

Related guides (5)

Anthropic

Anthropic: Frontier AI Lab at the Intersection of Capability and Safety Governance


GPT-5.5

GPT-5.5: OpenAI's Most Capable Model — and Its Most Complicated


DeepSeek V4

DeepSeek V4: The Open-Weights Giant Reshaping AI Economics


Hugging Face

Hugging Face: The Home of Open-Source AI


OpenAI

OpenAI: The Lab That Made AI a Household Word


Related events (8)

7 · The Batch · 28d ago

Z.ai's GLM-5.1 Open-Weights Model Targets Multi-Hour Agentic Coding Tasks with Iterative Self-Evaluation

Z.ai released GLM-5.1, a 754B-parameter mixture-of-experts open-weights model optimized for long-running agentic coding tasks, capable of cycling through planning, execution, and strategy revision hundreds of times over sessions lasting up to eight hours. The model achieves top open-weights scores on the Artificial Analysis Intelligence Index and third place on Arena's Code leaderboard, while leading SWE-Bench Pro in Z.ai's own evaluations at 58.4 percent. Weights are available on Hugging Face under an MIT license, with API pricing roughly 40 percent higher than its predecessor's but still below that of comparable proprietary models. No technical report has been published, leaving architecture and training details undisclosed.

Frontier Model Releases · Evaluation and Benchmarking · Gemini 3.1 Pro · Artificial Analysis Intelligence Index · Claude Opus 4.6

6 · The Batch · 28d ago

GLM-5.1 Open-Weights Model Targets Long-Running Agentic Tasks; Andrew Ng on Coding Agent Acceleration by Software Domain

Z.ai released GLM-5.1, an open-weights mixture-of-experts LLM (754B total / 40B active parameters) designed for sustained agentic coding tasks lasting up to eight hours, featuring iterative planning-execution-evaluation loops with thousands of tool calls. The model claims top open-weights performance on the Artificial Analysis Intelligence Index and SWE-Bench Pro, and is available under an MIT license via Hugging Face. The accompanying editorial by Andrew Ng offers a tiered framework for how much coding agents accelerate different categories of software work (frontend most, then backend, then infrastructure, and research least), with practical implications for team organization. A secondary item references data-center opposition and LLM helpfulness failure modes.

Frontier Model Releases · Evaluation and Benchmarking · DeepLearning.AI · Artificial Analysis Intelligence Index · SWE-bench

5 · Don't Worry About the Vase · 7d ago

Zvi Mowshowitz commentary: GLM-5.2 as new best open model

Zvi Mowshowitz covers the release of GLM-5.2, characterizing it as the new best open model. The post is a tier-2 commentary piece on what appears to be a significant open-weights model release. The body is truncated, so specific benchmark claims or technical details are not available from this excerpt.

Frontier Model Releases · Open Weights Progress · Zvi Mowshowitz · GLM-5.1

5 · Latent Space · 10d ago

GLM-5.2 passes community vibe checks; Z.ai forecasts Open Fable by December

GLM-5.2, a new open model, is reportedly passing community vibe checks and drawing comparisons to GPT-class frontier models. Z.ai has forecast the release of Open Fable by December. The item signals a potential shift in the open-weights landscape toward genuine frontier-level capability.

Frontier Model Releases · Open Weights Progress · Open Fable · GLM-5.1 · Z.ai

7 · The Batch · 12d ago

Data Points: GLM-5.2 leads open models on coding benchmarks; SpaceX acquires Cursor; OpenRouter Fusion; Anthropic coding study; ChatGPT market share drops

Zhipu (Z.ai) released GLM-5.2, a 744B-parameter open model under an MIT license that ranks second only to Claude Opus 4.8 on long-horizon coding benchmarks including FrontierSWE and SWE-Marathon, featuring a 1M-token context window and a 2.9× compute reduction via IndexShare attention. SpaceX is acquiring Cursor (Anysphere) for $60B in stock, positioning Musk's company to compete in AI software tools using xAI's Colossus infrastructure. OpenRouter launched Fusion, a multi-model synthesis tool showing that budget model panels can match frontier model performance at half the cost. An Anthropic study of 400K Claude Code sessions found that domain expertise, not coding skill, is the primary driver of agentic output, while a Munich court ruled Google liable for false claims in AI Overviews.

Frontier Model Releases · Evaluation and Benchmarking · DRACO · FrontierSWE · Anysphere

6 · Latent Space · 12d ago

GLM-5.2 claims top frontend coding performance; IndexShare speculative decoding introduced

A Latent Space AI news digest highlights GLM-5.2 as a new open-weights model claiming top performance on frontend coding tasks. The digest also covers IndexShare, a technique for speculative decoding. The body is truncated but the headline signals a notable open-weights model release and an inference optimization development.

Evaluation and Benchmarking · Open Weights Progress · IndexShare · GLM-5.1 · Latent Space

5 · Hugging Face Blog · 12d ago

GLM-5.2 announced as model built for long-horizon tasks

Z.ai, through its zai-org organization on Hugging Face, published a blog post announcing GLM-5.2, a model positioned for long-horizon tasks. The post appears to be a model release announcement from the GLM (General Language Model) lineage. Limited body content is available, but the framing suggests capabilities relevant to extended reasoning or agentic workflows.

Long Context Evolution · Frontier Model Releases · zai-org · Hugging Face · GLM-5.1

5 · The Batch · 3d ago

The Batch Issue 359: Loop Engineering for Agentic Coding, GLM-5.2 Open-Weights Release, Apple On-Device Models

Andrew Ng's weekly letter introduces a framework of three nested loops for agentic software development (engineering loop, developer feedback loop, external feedback loop), contextualizing the 'loop engineering' trend popularized by the creators of Claude Code and OpenClaw. The issue also covers Z.ai's GLM-5.2, a 753B MoE open-weights model with a 1M-token context that claims first place among open models on the Artificial Analysis Intelligence Index v4.1 and leads all models on PostTrainBench for long-running agentic tasks. Additional coverage includes Apple's recipe for on-device models and AI education trends.

Frontier Model Releases · Evaluation and Benchmarking · DeepLearning.AI · Artificial Analysis Intelligence Index · Boris Cherny