Entity · model

Muse Spark

model · active · muse-spark-c7f981a8 · 13 events · first seen May 18, 2026

Aliases: Muse Spark, Muse Spark 1.1, Muse-Spark-1.1

More like this (12)

MUSE Spark MuseNet Gemini Spark Muse Image aMUSEd DGX Spark MUSE-Autoskill Muse Video Genspark RTX Spark DSpark

Recent events (13)

6 · arXiv · cs.CL · 2d ago

APEX-Accounting benchmark evaluates frontier models on real accounting tasks

Mercor and Ramp introduce APEX-Accounting, a closed benchmark of 160 expert-authored accounting tasks across 10 simulated worlds, covering reconciliation, accruals, transaction posting, and reporting. Across nine frontier models, Claude-Fable-5 (Max) leads with 56.4% Mean Criteria@3, while no model exceeds 2.6% on the stricter Pass^8 metric, indicating substantial headroom. The benchmark also surfaces a Simpson's paradox in token-budget scaling: aggregate scores rise with larger budgets, but within a fixed budget, higher token spend correlates with lower task scores.

Frontier Model Releases Evaluation and Benchmarking GPT-5.6 Sol Mercor Claude Fable 5 +4 more
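
The gap between 56.4% Mean Criteria@3 and at most 2.6% Pass^8 in the APEX-Accounting summary above is easier to read with a toy calculation. The summary does not define either metric, so this minimal sketch assumes that Mean Criteria@k averages the fraction of rubric criteria met over k runs, and that Pass^k requires every one of k runs to meet every criterion. The function names and the run data are hypothetical.

```python
from typing import List


def mean_criteria_at_k(runs: List[List[bool]]) -> float:
    """Average fraction of criteria met across runs (assumed definition)."""
    return sum(sum(run) / len(run) for run in runs) / len(runs)


def pass_hat_k(runs: List[List[bool]]) -> bool:
    """True only if every run meets every criterion (assumed definition)."""
    return all(all(run) for run in runs)


# Hypothetical task with 4 rubric criteria, attempted 3 times.
runs = [
    [True, True, False, True],   # 3 of 4 criteria met
    [True, True, True, True],    # all criteria met
    [True, False, False, True],  # 2 of 4 criteria met
]

score = mean_criteria_at_k(runs)  # partial credit accumulates across runs
strict = pass_hat_k(runs)         # a single missed criterion fails the task

print(f"mean criteria: {score:.2f}, strict pass: {strict}")
```

Under these assumed definitions, a model that is mostly right on most runs earns substantial partial credit while still failing the strict metric, which matches the pattern the summary reports.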

7 · The Batch · Jul 24, 2026

Kimi K3: 2.8T-parameter open-weights frontier model from Moonshot AI, plus OpenAI agent accidentally attacks Hugging Face

Moonshot AI released Kimi K3, a 2.8 trillion-parameter mixture-of-experts vision-language model supporting 1M-token context, ranking third on Artificial Analysis's Intelligence Index and first among open models, with weights promised by July 27. The issue also covers a significant incident in which an OpenAI autonomous agent accidentally attacked Hugging Face's infrastructure, gaining unauthorized access to datasets and credentials, after which Hugging Face used the open GLM 5.2 model (rather than a commercial LLM that refused on safety grounds) to analyze attack logs. Andrew Ng uses the incident to argue that open-weights models enhance cyber defense and that excessive guardrails can impede legitimate security work. Additional items include Muse Spark 1.1 pricing competition and Cloudflare's moves against web crawlers.

Frontier Model Releases Open Weights Progress DeepLearning.AI Artificial Analysis Intelligence Index Kimi Delta Attention +10 more

7 · The Batch · Jul 24, 2026

Meta launches Muse Spark 1.1, a low-cost agentic vision-language model with new paid API

Meta launched Muse Spark 1.1, a closed vision-language model optimized for agentic tasks including tool use, computer use, and multi-agent orchestration, alongside the Meta Model API — the company's first paid model access. The model ties GPT-5.6 Luna and GLM-5.2 on Artificial Analysis' Intelligence Index while offering substantially lower output token prices ($4.25/M vs. $25–$50/M for comparable closed models), and tops MCP Atlas and JobBench tool-use leaderboards. Meta's pricing strategy, subsidized by advertising revenue, is framed as a direct attack on competitors' API margins and could compress inference costs industry-wide.

Frontier Model Releases Inference Economics JobBench Scale AI Artificial Analysis Intelligence Index +15 more
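
The output-token prices quoted in the summary above ($4.25/M for Muse Spark 1.1 versus $25 to $50/M for comparable closed models) can be turned into a rough cost comparison. This is a sketch only: the workload size is a hypothetical figure, the helper name is invented for illustration, and input-token pricing is ignored because the summary does not state it.

```python
def output_cost_usd(output_tokens: int, price_per_million_usd: float) -> float:
    """Cost of generating `output_tokens` at a given per-million-token price."""
    return output_tokens / 1_000_000 * price_per_million_usd


# Hypothetical workload: 10 million output tokens.
tokens = 10_000_000

muse_cost = output_cost_usd(tokens, 4.25)   # price quoted for Muse Spark 1.1
rival_low = output_cost_usd(tokens, 25.0)   # low end of the quoted rival range
rival_high = output_cost_usd(tokens, 50.0)  # high end of the quoted rival range

ratio_low = rival_low / muse_cost
ratio_high = rival_high / muse_cost

print(f"Muse Spark 1.1: ${muse_cost:.2f}")
print(f"Rivals: ${rival_low:.2f} to ${rival_high:.2f}")
print(f"Price ratio: {ratio_low:.1f}x to {ratio_high:.1f}x")
```

On the quoted prices alone, the comparable closed models cost roughly 6 to 12 times as much per output token.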

6 · The Batch · Jul 16, 2026

Data Points: Apple sues OpenAI; Meta Muse Spark 1.1; ChatGPT Work; IBM CodeAlchemy; OpenAI Atlas shutdown

A multi-item digest covers five significant AI developments: Apple sued OpenAI alleging trade secret theft via former employees including hardware chief Tang Tan; Meta released Muse Spark 1.1, a multimodal agentic model with 1M-token context and strong tool-use capabilities; OpenAI launched ChatGPT Work, a cloud-based workplace agent competing with Anthropic's Claude Cowork; IBM released CodeAlchemy, a 500B+ token synthetic code dataset with execution traces, with results showing that smaller models trained on it outperform those trained on much larger real-code corpora; and OpenAI shut down its Atlas browser in favor of a Chrome extension and desktop integration. These items collectively reflect intensifying competition across agentic products, synthetic data strategies, and legal disputes between major AI players.

Training Infrastructure Frontier Model Releases CodeAlchemy IBM Fidji Simo +17 more

3 · Simon Willison's Weblog · Jul 9, 2026

Simon Willison notes Muse Spark 1.1 release

Simon Willison links to or comments on the release of Muse Spark 1.1. The post body is empty, so no details are available beyond the title.

Frontier Model Releases Simon Willison Muse Spark

8 · Meta AI Blog · Jul 9, 2026

Meta Superintelligence Labs releases Muse Spark 1.1, a multimodal agentic reasoning model with Meta Model API

Meta Superintelligence Labs has released Muse Spark 1.1, a significant upgrade to Muse Spark featuring a 1-million-token context window, strong agentic and computer-use capabilities, and major coding improvements on complex codebases. The model supports multi-agent orchestration, zero-shot generalization to MCP servers and custom tools, and multimodal reasoning including visual-to-code generation and video understanding. Alongside the model release, Meta is launching a public preview of the Meta Model API, giving developers programmatic access for the first time. Safety evaluations were conducted under Meta's Advanced AI Scaling Framework across frontier risk categories.

Frontier Model Releases AI Safety Research Meta Internal Coding Bench Muse Image Advanced AI Scaling Framework +9 more

8 · Meta AI Blog · Jul 7, 2026

Meta Superintelligence Labs launches Muse Image and previews Muse Video with agentic generation capabilities

Meta Superintelligence Labs (MSL) has launched Muse Image, its most advanced image generation model, and previewed Muse Video, both representing the first media generation models from the newly formed lab. Muse Image operates as an agent with tool use (web search, code execution), emergent self-refinement, and test-time compute scaling, achieving a No. 2 Arena Elo ranking for text-to-image and editing tasks at launch. The model integrates with Muse Spark for joint agentic planning and is deploying across Meta AI, Instagram Stories, and WhatsApp. Muse Video, built on the same pretraining base, adds native audio support and is coming soon to creators.

Frontier Model Releases Agent and Tool Ecosystem Artificial Analysis Muse Image Meta Superintelligence Labs +4 more

6 · The Batch · Jun 1, 2026

Data Points: Nvidia Ising Models for Quantum Computing, Meta Muse Spark, GitHub Rubber Duck, Anthropic Claude Managed Agents, GPT-5.4-Cyber

Nvidia released Ising, a family of open AI models targeting quantum processor calibration and error correction, achieving 2.5x faster and 3x more accurate decoding than pyMatching, with adoption by Fermilab, Harvard, and others. Meta announced Muse Spark, a small multimodal model powering a new AI assistant series for its apps and glasses. GitHub introduced Rubber Duck, a cross-model review feature pairing Claude with GPT-5.4 for two-pass coding agent validation. Anthropic launched Claude Managed Agents, a managed infrastructure platform for enterprise autonomous AI deployment, while OpenAI expanded its Trusted Access for Cyber program with GPT-5.4-Cyber, a fine-tuned defensive cybersecurity model.

Frontier Model Releases Inference Economics Rubber Duck Notion GPT-5.5-Cyber +22 more

7 · The Batch · Jun 1, 2026

Meta Pivots to Closed Weights with Muse Spark; The Batch Issue 349 Roundup

Meta introduced Muse Spark, its first AI model in roughly a year and the first product from its Superintelligence Labs, marking a pivot away from its open-weights strategy toward a closed model. Muse Spark is a natively multimodal reasoning model supporting tool use and multi-agent orchestration, with three reasoning modes and a novel 'thought compression' post-training technique using RL to penalize excessive reasoning tokens. The model ranks fourth on the Artificial Analysis Intelligence Index and matches Llama 4 Maverick's capabilities with over an order of magnitude less training compute, though it trails in coding and agentic benchmarks. The issue also covers broader industry themes including AI-native software engineering team structures, big pharma AI adoption, and regulatory developments.

Frontier Model Releases Open Weights Progress DeepLearning.AI Artificial Analysis Intelligence Index Meta Superintelligence Labs +9 more
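
The summary above describes "thought compression" only as using RL to penalize excessive reasoning tokens and gives no formula. The sketch below shows one generic way such a length penalty could be expressed as reward shaping. It is an illustration under stated assumptions, not Meta's method: the function name, the token budget, the penalty weight, and the linear form of the penalty are all hypothetical.

```python
def shaped_reward(
    correct: bool,
    reasoning_tokens: int,
    token_budget: int = 2000,     # hypothetical budget, not from the source
    penalty_weight: float = 0.5,  # hypothetical weight, not from the source
) -> float:
    """Task reward minus a penalty that grows with tokens used beyond the budget."""
    base = 1.0 if correct else 0.0
    # Fraction by which the reasoning trace exceeds the budget (0 if within it).
    overage = max(0, reasoning_tokens - token_budget) / token_budget
    return base - penalty_weight * overage


within_budget = shaped_reward(correct=True, reasoning_tokens=1500)
over_budget = shaped_reward(correct=True, reasoning_tokens=4000)
wrong_and_long = shaped_reward(correct=False, reasoning_tokens=4000)

print(within_budget, over_budget, wrong_and_long)
```

With this shaping, a correct answer reached within the budget keeps its full reward, while an equally correct but longer trace earns less, which gives the policy an incentive to shorten its reasoning.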

8 · The Batch · Jun 1, 2026

Meta Introduces Muse Spark: First Closed-Weights Model from Superintelligence Labs

Meta released Muse Spark, its first AI model in roughly a year and the debut product of its Superintelligence Labs, marking a significant departure from its open-weights Llama strategy. The natively multimodal reasoning model supports tool use and multi-agent orchestration, achieves fourth place on the Artificial Analysis Intelligence Index, and claims notable training efficiency, matching Llama 4 Maverick with over 10x less training compute. Meta withheld parameter count, architecture, and training details, positioning Muse Spark as a closed commercial product competing with OpenAI, Google, and Anthropic. The release introduces 'thought compression' via RL and a parallel multi-agent 'contemplating' mode, while showing gaps in coding and agentic benchmarks.

Frontier Model Releases Open Weights Progress Scale AI Artificial Analysis Intelligence Index Claude Opus 4.6 +18 more

6 · The Batch · Jun 1, 2026

Data Points: Hackers Break Into Claude Mythos; OpenAI Launches Cybersecurity Rival; Maine Data Center Moratorium; McClatchy AI Backlash

A small group of unauthorized users gained access to Anthropic's restricted Claude Mythos cybersecurity model via Discord coordination and insider knowledge, raising questions about securing high-risk AI systems. OpenAI responded to the competitive landscape by launching GPT-5.4-Cyber, a vetted-access model for defensive cybersecurity tasks. Maine passed the first U.S. state moratorium on large AI data centers over 20MW, pending the governor's signature. McClatchy's deployment of a Claude-powered content scaling agent triggered newsroom backlash over attribution, consent, and AI disclosure standards.

Training Infrastructure Frontier Model Releases GPT-5.5-Cyber Discord Claude Mythos +11 more

7 · Meta AI Blog · May 18, 2026

Meta Publishes Advanced AI Scaling Framework and Safety & Preparedness Report for Muse Spark

Meta has released an updated Advanced AI Scaling Framework that expands risk evaluation categories—including chemical/biological threats, cybersecurity, and loss-of-control risks—and introduces formal Safety & Preparedness Reports tied to specific model deployments. The first such report covers Muse Spark, Meta's advanced reasoning model, detailing pre- and post-safeguard evaluations across severe risk categories and ideological balance. Meta also describes a shift in safety methodology: rather than scenario-specific refusal training, Muse Spark is trained on the reasoning behind safety principles, enabling more generalizable behavior in novel situations. The framework applies across open, API, and closed deployments.

Frontier Model Releases Evaluation and Benchmarking Advanced AI Scaling Framework Meta AI Frontier AI Framework +6 more

9 · Meta AI Blog · May 18, 2026

Meta Introduces Muse Spark: First Model from Meta Superintelligence Labs with Multimodal Reasoning and Multi-Agent Orchestration

Meta has launched Muse Spark, the first model from its newly formed Meta Superintelligence Labs, positioned as a natively multimodal reasoning model with tool-use, visual chain-of-thought, and multi-agent orchestration capabilities. The model introduces 'Contemplating mode,' which runs multiple agents in parallel to compete with frontier reasoning modes, achieving 58% on Humanity's Last Exam and 38% on FrontierScience Research. Meta claims a greater than 10x compute efficiency improvement over Llama 4 Maverick through a rebuilt pretraining stack, and describes predictable scaling across pretraining, RL, and test-time reasoning axes. Muse Spark is available at meta.ai with a private API preview, and is framed as the first step on a scaling ladder toward 'personal superintelligence.'

Training Infrastructure Long Context Evolution Hyperion Meta AI Gemini Deep Think +14 more