Almanac
4·Hacker News (AI-filtered, score >= 200)·7d ago

HN community discussion: GLM 5.2 vs. Claude Opus comparison

A Hacker News thread with 347 points and 244 comments compares GLM 5.2 against Claude Opus. The high engagement suggests active community interest in how a Chinese open-weights frontier model stacks up against Anthropic's flagship. No body content is available beyond the title and engagement metrics.

Related events (8)

5·Don't Worry About the Vase·7d ago

Zvi Mowshowitz commentary: GLM-5.2 as new best open model

Zvi Mowshowitz covers the release of GLM-5.2, characterizing it as the new best open model. The post is a tier-2 commentary piece on what appears to be a significant open-weights model release. The body is truncated, so specific benchmark claims or technical details are not available from this excerpt.

6·Simon Willison's Weblog·11d ago

Simon Willison: GLM-5.2 is probably the most powerful text-only open weights LLM

Simon Willison asserts that GLM-5.2 is likely the most capable text-only open-weights language model currently available. The post is a commentary from a respected practitioner tracking the open-weights landscape. This is notable as a signal about the state of open-weights competition relative to closed frontier models.

5·Hacker News·32h ago

Semgrep: GLM 5.2 outperforms Claude on cybersecurity benchmarks

Semgrep published a blog post reporting that GLM 5.2 beats Claude on their internal cybersecurity benchmarks, framed as a 'we have Mythos at home' comparison. The post appears to evaluate models on cyber-specific tasks relevant to Semgrep's code security tooling. This is a practitioner-level benchmark comparison from a security-focused company, providing real-world signal on model performance in a specialized domain.

7·The Batch·3d ago

Z.ai releases GLM-5.2, a 753B MoE open-weights model claiming top open-model ranking on agentic coding benchmarks

Z.ai released GLM-5.2, a 753-billion-parameter mixture-of-experts open-weights model optimized for long-running agentic coding tasks, with a 1-million-token input context and an MIT license. The model ranks first among open-weights models on Artificial Analysis's Intelligence Index v4.1 (score 51, behind Claude Opus 4.8 at 56 and GPT-5.5 at 55) and leads all models on PostTrainBench, a benchmark for agentic fine-tuning tasks. Key technical contributions include a modified sparse attention indexer applied every four layers (cutting per-token computation by 2.9x at 1M context), a switch from GRPO to PPO for long-horizon RL training, and a reward-hacking mitigation pipeline that combines rule-based filters with a judge model. API pricing is substantially below that of comparable proprietary models, and the release coincides with U.S. government restrictions on access to Anthropic's frontier models.

5·Latent Space·10d ago

GLM-5.2 passes community vibe checks; Z.ai forecasts Open Fable by December

GLM-5.2, a new open model, is reportedly passing community vibe checks and drawing comparisons to GPT-class frontier models. Z.ai has forecast the release of Open Fable by December. The item signals a potential shift in the open-weights landscape toward genuine frontier-level capability.

7·The Batch·12d ago

Data Points: GLM-5.2 leads open models on coding benchmarks; SpaceX acquires Cursor; OpenRouter Fusion; Anthropic coding study; ChatGPT market share drops

Zhipu released GLM-5.2, a 744B-parameter open model under the MIT license that ranks second only to Claude Opus 4.8 on long-horizon coding benchmarks, including FrontierSWE and SWE-Marathon. It features a 1M-token context window and a 2.9× compute reduction via IndexShare attention. SpaceX is acquiring Cursor (Anysphere) for $60B in stock, positioning Musk's company to compete in AI software tools using xAI's Colossus infrastructure. OpenRouter launched Fusion, a multi-model synthesis tool showing that panels of budget models can match frontier model performance at half the cost. An Anthropic study of 400K Claude Code sessions found that domain expertise, not coding skill, is the primary driver of agentic output. Separately, a Munich court ruled Google liable for false claims in AI Overviews.

5·Don't Worry About the Vase·27d ago

Zvi Mowshowitz analyzes Claude Opus 4.8 capabilities and community reactions

Zvi Mowshowitz (Don't Worry About the Vase) publishes a roundup and analysis of Claude Opus 4.8, aggregating capability observations and community reactions to the new model. The post synthesizes multiple data points to characterize the model's strengths and weaknesses. This is a secondary commentary piece following what appears to be a recent Anthropic model release.

6·The Batch·28d ago

GLM-5.1 Open-Weights Model Targets Long-Running Agentic Tasks; Andrew Ng on Coding Agent Acceleration by Software Domain

Z.ai released GLM-5.1, an open-weights mixture-of-experts LLM (754B total / 40B active parameters) designed for sustained agentic coding tasks lasting up to eight hours, featuring iterative planning-execution-evaluation loops with thousands of tool calls. The model claims top open-weights performance on the Artificial Analysis Intelligence Index and SWE-Bench Pro, and is available under the MIT license via Hugging Face. The accompanying editorial by Andrew Ng offers a tiered framework for how much coding agents accelerate different categories of software work (frontend most, then backend, then infrastructure, and research least), with practical implications for team organization. A secondary item references data-center opposition and LLM helpfulness failure modes.