Almanac
company

Moonshot AI

companyactivemoonshot-ai-7cf86153·10 events·first seen 1mo ago

Aliases: Moonshot AI, Moonshot

Co-occurring entities

More like this (12)

Recent events (10)

6The Batch·15d ago·source ↗

Kimi K2.6: Moonshot AI's 1T-Parameter Vision-Language Model Matches Open-Weights Peers, Trails Top Closed Models

Moonshot AI released Kimi K2.6, a 1 trillion-parameter mixture-of-experts vision-language model with 32B active parameters, designed for long-horizon autonomous coding sessions lasting multiple days and multi-agent orchestration scaling to 300 parallel subagents executing up to 4,000 steps. The model matches Qwen3.6 Max Preview and DeepSeek-V4-Pro on the Artificial Analysis Intelligence Index (scoring 54 vs. their 52) while trailing closed models like GPT-5.5 and Claude Opus 4.7. Weights are freely downloadable from Hugging Face under a modified MIT license permitting commercial use, with API access priced at $0.95/$0.16/$4.00 per million input/cached/output tokens. Notable features include a 256K token context window, native INT4 quantization, a 'preserve thinking' mode for multi-turn reasoning continuity, and a research preview 'claw groups' feature enabling cross-developer agent collaboration.

9Anthropic News·15d ago·source ↗

Anthropic Identifies Industrial-Scale Distillation Attacks by DeepSeek, Moonshot, and MiniMax

Anthropic has publicly identified three Chinese AI laboratories—DeepSeek, Moonshot AI, and MiniMax—as conducting coordinated, large-scale distillation attacks against Claude, generating over 16 million exchanges through approximately 24,000 fraudulent accounts in violation of terms of service. The campaigns targeted Claude's most differentiated capabilities including agentic reasoning, tool use, coding, and chain-of-thought generation, with MiniMax alone responsible for over 13 million exchanges. Anthropic frames these attacks as a national security concern, arguing that illicitly distilled models strip out safety safeguards and undermine US export controls. The company claims high-confidence attribution via IP correlation, request metadata, and infrastructure indicators, in some cases corroborated by industry partners.

6The Batch·27d ago·source ↗

Data Points: Cursor Composer 2.5, Gemini 3.5 Flash, Antigravity 2.0, Omni Flash, AI Search, and Corti Symphony

This edition covers several notable AI product and model releases: Cursor shipped Composer 2.5 (built on Kimi K2.5) scoring 79.8% on SWE-Bench Multilingual at significantly lower cost than frontier competitors; Google released Gemini 3.5 Flash with claimed 4x speed advantage and launched Antigravity 2.0 as an agent-first desktop app replacing its IDE; Google also introduced Gemini Omni Flash for multimodal video generation and overhauled its search interface with Gemini 3.5. Additionally, Copenhagen-based Corti launched Symphony for Speech-to-Text achieving 1.4% word error rate on medical terminology versus 17-19% for generalist models.

6The Batch·14d ago·source ↗

MiniMax M2.7 proprietary reasoning model competes with Gemini and Claude Opus; roundup covers Cursor Composer 2, MAI-Image-2, Claude Code Channels, and Anthropic defense dispute

MiniMax released M2.7, a proprietary reasoning model that achieved 66.6% on MLE Bench Lite (tying Gemini 3.1) and 56.22% on SWE-Pro, priced at $0.30/$1.20 per million tokens, with the shift to proprietary marking a potential strategic pivot among Chinese AI labs away from open weights. Cursor released Composer 2, an agentic coding model built on a fine-tuned Kimi 2.5 (via Moonshot partnership), priced 86% cheaper than its predecessor and scoring 73.7 on SWE-bench Multilingual. Anthropic released Claude Code Channels, routing Telegram and Discord messages into local Claude Code sessions via MCP plugins, and separately filed a court response denying it has any backdoor or kill switch into military deployments of Claude. Microsoft announced MAI-Image-2, a text-to-image model ranking third on Arena.ai among research labs.

6The Batch·6d ago·source ↗

Data Points: Apple/Google Siri overhaul, Gemma 4 12B, Kimi Code CLI, OpenJarvis, and U.S. OpenAI stake talks

A multi-item digest covers several significant AI developments: Apple is expected to announce a revamped Siri at WWDC that uses Google Gemini models distilled for on-device use alongside cloud routing, marking a notable Apple-Google AI partnership. Google released Gemma 4 12B, an encoder-free multimodal open-weights model designed for consumer laptops under Apache 2.0. Moonshot AI released Kimi Code CLI, an open-source terminal coding agent with native subagent orchestration and conversational MCP configuration. Stanford and Lambda Labs released OpenJarvis, an on-device agent framework claiming near-cloud accuracy at 800× lower API cost. The White House and OpenAI are reportedly negotiating a government equity stake in OpenAI as part of a proposed Public Wealth Fund.

6Interconnects·1mo ago·source ↗

Latest open artifacts (#21): Open model bonanza — Gemma 4, DeepSeek V4, Kimi K2.6, MiMo 2.5, GLM-5.1 & others

Interconnects' recurring open-weights roundup covers a dense cluster of recent releases including Gemma 4, DeepSeek V4, Kimi K2.6, MiMo 2.5, and GLM-5.1, characterizing the period as a flagship-after-flagship cadence. The piece also includes commentary on CAISI's assessment of DeepSeek V4. As a tier-2 commentary source, this is a synthesis and analysis layer rather than primary announcements.

7The Batch·15d ago·source ↗

GPT-5.5 Outperforms Benchmarks but Leads in Hallucination Rate; Kimi K2.6 Tops Open LLMs

GPT-5.5, OpenAI's latest closed vision-language model built for agentic coding and computer use, tops the Artificial Analysis Intelligence Index and ARC-AGI-2 benchmarks but exhibits a significantly higher hallucination rate (85.53%) compared to Claude Opus 4.7 (36.18%) and Gemini 3.1 Pro Preview (49.87%) on the AA-Omniscience benchmark. GPT-5.5 Pro processes reasoning tokens in parallel during inference, and pricing is roughly double GPT-5.4 rates. The model ranks lower on subjective Arena.ai leaderboards, where Claude Opus models dominate. The issue also notes Kimi K2.6 leading open-weight LLMs, though details on that item are truncated.

7The Batch·11d ago·source ↗

Gray market API proxy network enables discounted access to U.S. AI models in China via fraud and distillation

A ChinaTalk report details an informal ecosystem of API proxy servers, account farms, identity brokers, and token resellers that gives Chinese developers access to U.S. AI models like Claude, ChatGPT, and Gemini at steep discounts — sometimes 10% of market price — through methods ranging from terms-of-service violations to credit card fraud. CISPA Helmholtz Center research found proxy 'Gemini-2.5' access achieved only 37% on MedQA versus 83.82% via Google's official API, suggesting model substitution is common. The network also harvests API call logs as training data, feeding the industrial-scale distillation practices Anthropic accused DeepSeek, Moonshot, and MiniMax of in February. The White House acknowledged the distillation threat in an April memo, framing it as an adversarial national security concern.

6The Batch·4d ago·source ↗

Cursor's Composer 2.5 rivals GPT-5.5 and Claude Opus 4.7 on coding benchmarks at lower cost

Cursor released Composer 2.5, a specialized agentic coding model built on Moonshot's Kimi K2.5 open weights with additional pretraining and reinforcement learning fine-tuning tailored to Cursor's own CLI harness. The model ranks third on the Artificial Analysis Coding Agent Index behind Claude Opus 4.7 and GPT-5.5 at max reasoning, but significantly undercuts them on cost ($0.44 vs $4.14 per task) and speed (6.7 vs 17.7 minutes). The training approach—co-optimizing model and harness together using synthetic tasks, text feedback during RL, and 25x more synthetic data than Composer 2—illustrates a specialist model strategy that challenges the dominance of generalist frontier models in coding workflows.

7The Batch·14d ago·source ↗

Nvidia releases Nemotron 3 Super 120B-A12B open-weights model with hybrid Mamba-2/MoE architecture

Nvidia released Nemotron 3 Super 120B-A12B, an open-weights LLM with a hybrid Mamba-2/transformer/MoE architecture that activates only 12B parameters per token and supports up to 1 million token context. The model claims the fastest inference speed in its size class at 442 tokens/second and leads open-weights models on PinchBench agentic task evaluation, outperforming larger models including Kimi K2.5 (1T parameters). Nvidia is releasing weights, training data, and recipes under a permissive commercial license, and plans a $26B five-year investment in open-weights models — framed partly as a strategic response to Chinese labs building capable open-weights models on non-Nvidia hardware.