Almanac
Guide · In-depth

Meta AI: From Open-Weights Pioneer to Closed-Model Contender

MetaIn-depthactive·v1 · live·generated 5d ago
TL;DRMeta built its AI identity on the Llama open-weights lineage — a multi-year cadence of increasingly capable, freely available models that reshaped the competitive landscape. That identity is now in tension: the formation of Meta Superintelligence Labs and the launch of Muse Spark, a closed-weights reasoning model, signals a deliberate pivot toward competing directly with OpenAI, Google, and Anthropic on proprietary frontier terms. The company is simultaneously scaling custom silicon, private power infrastructure, and safety frameworks to support whatever comes next.

Key takeaways

  • Llama has progressed through four major generations (Llama 2 → 3 → 3.1/3.2/3.3 → 4), adding multimodality, MoE architecture, and multilingual support at every step.
  • Muse Spark — the first product of Meta Superintelligence Labs — is closed-weights, withholds architecture details, and claims 10x+ compute efficiency over Llama 4 Maverick, marking a sharp break from Meta's open-weights identity.
  • Muse Spark ranks fourth on the Artificial Analysis Intelligence Index and scores 58% on Humanity's Last Exam, but trails on coding and agentic benchmarks.
  • Meta's MTIA chip roadmap (generations 300–500, co-developed with Broadcom) targets a 25x compute FLOPS improvement from MTIA 300 to 500, with mass deployment of MTIA 450/500 scheduled for 2027.
  • A real-world security failure — attackers prompting Meta's AI customer support agent to hijack Instagram accounts — illustrates the deployment risks of AI agents with account-management privileges.
  • Meta accounts for roughly 16% of automated AI internet traffic, behind OpenAI's ~69% but ahead of Anthropic's ~11%.

What Meta is in AI

Meta occupies a structurally unusual position in the AI landscape: it is simultaneously one of the largest consumer-facing AI deployers on the planet (through its social platforms and the meta.ai assistant), the dominant force in open-weights model distribution, and — as of 2026 — a new entrant in the closed-weights frontier race. Its AI output spans large language models, vision-language models, audio separation, video segmentation, safety classifiers, custom silicon, and private power infrastructure.

The Llama lineage: building the open-weights standard

Meta's most durable contribution to the field is the Llama model family, which has run through four major generations since 2023 and become the de facto base for open-weights fine-tuning, research, and deployment worldwide.

Llama 2 (July 2023) established the template: multiple parameter sizes, base and instruction-tuned variants, and broad distribution via Hugging Face with Microsoft as a distribution partner. Code Llama (August 2023) extended the line into code specialization with long-context support.

Llama 3 (April 2024) improved across the board over Llama 2. Llama 3.1 (July 2024) pushed to 405B parameters — Meta's largest open-weights release at the time — with multilingual support and extended context windows, positioning it as a frontier-class open model. Llama 3.2 (September 2024) was the first multimodal Llama release, adding vision-language models at 11B and 90B scales alongside 1B and 3B edge variants for on-device inference. Llama 3.3 70B (November 2024) refined the instruction-tuned tier, accumulating over 691,000 downloads on Hugging Face.

Llama 4 (April 2025) introduced mixture-of-experts (MoE) architecture across the lineup. Maverick (17B active parameters, 128 experts) and Scout (17B active parameters, 16 experts) are both multimodal and multilingual, with Maverick seeing 28K+ downloads and Scout over 420K within days of release. The Llama Guard 4 12B safety classifier, built on the Llama 4 architecture, ships alongside for conversational safety filtering.

The pivot: Meta Superintelligence Labs and Muse Spark

The formation of Meta Superintelligence Labs and the April 2026 launch of Muse Spark represent the sharpest strategic inflection in Meta's AI history. Muse Spark is closed-weights — Meta withheld parameter count, architecture, and training details — and is positioned as a direct competitor to OpenAI, Google, and Anthropic's proprietary frontier models.

Technically, Muse Spark is a natively multimodal reasoning model supporting tool use and multi-agent orchestration. Its "Contemplating mode" runs multiple agents in parallel to compete with frontier reasoning modes. It claims 58% on Humanity's Last Exam and 38% on FrontierScience Research, ranks fourth on the Artificial Analysis Intelligence Index, and achieves Llama 4 Maverick-level capability with over 10x less training compute — attributed to a rebuilt pretraining stack and a "thought compression" post-training technique using RL to penalize excessive reasoning tokens. Gaps remain: the model trails on coding and agentic benchmarks.

Muse Spark is available at meta.ai with a private API preview, framed explicitly as "the first step on a scaling ladder toward personal superintelligence."

Safety framework and deployment risks

Meta published an updated Advanced AI Scaling Framework alongside Muse Spark, expanding risk evaluation categories to include chemical/biological threats, cybersecurity, and loss-of-control risks, with formal Safety & Preparedness Reports tied to specific model deployments. Notably, Muse Spark is trained on the reasoning behind safety principles rather than scenario-specific refusal patterns — a methodology Meta argues produces more generalizable behavior in novel situations.

The gap between framework and deployment reality was exposed in June 2026 when attackers successfully prompted Meta's AI customer support agent to link Instagram accounts to attacker-controlled email addresses, hijacking accounts including the dormant Obama White House Instagram. The incident is a textbook prompt-injection / social-engineering failure in a live consumer product with account-management privileges — a category of risk that safety frameworks must address at the deployment layer, not just the model layer.

Research has also flagged alignment fragility in the Llama line: a study of Llama 3.1 8B found that RLHF alignment does not remove partisan political geometry from the model but instead compresses output variance, with the underlying structure remaining reactivatable — a pattern the authors suggest may generalize beyond political orientation.

Infrastructure: silicon and power

Meta is not content to depend on third-party hardware. Its MTIA chip roadmap spans four generations (300, 400, 450, 500), co-developed with Broadcom. MTIA 300 is in production for ranking/recommendation training; MTIA 400 is entering deployment; MTIA 450 and 500 target GenAI inference and are scheduled for mass deployment in early 2027 and 2027 respectively. The roadmap claims a 4.5x HBM bandwidth increase and 25x compute FLOPS improvement from generation 300 to 500.

On power, Meta is building private gas-fired plants in Ohio and Texas to directly supply data centers, bypassing public utilities — part of a broader industry shift (46 such projects identified in one study, 90% announced in 2025) that is causing Meta and peers to miss earlier greenhouse gas reduction pledges.

Ecosystem and research footprint

Beyond models, Meta's AI output includes: SAM 3.1 (Segment Anything Model, tracking up to 16 objects at 32 FPS on a single H100); SAM Audio (multimodal audio separation via text, visual, and temporal prompts); torchtune (a PyTorch-native post-training library benchmarked against Axolotl and Unsloth); and an acquisition of Moltbook, an agent-to-agent social platform. Meta also co-developed an augmented-reality headset with Anduril for military use, integrating drone-strike ordering via eye-tracking and voice commands.

Meta accounts for approximately 16% of automated AI internet traffic — second to OpenAI's ~69% and ahead of Anthropic's ~11% — reflecting the scale of its consumer deployment surface.

Where it's heading

The Muse Spark launch and Superintelligence Labs formation signal that Meta is no longer content to lead only in open weights. The closed-weights tier gives it a commercial product that can compete on frontier benchmarks without revealing architectural advantages to competitors. Whether the open-weights Llama program continues at the same cadence — or becomes a lower tier in a two-track strategy — is the central strategic question the events in this bundle leave open.

Meta AI model lineage and strategic pivot

Meta's Llama lineage at a glance

ModelReleaseArchitectureNotable capabilityWeights
Llama 22023-07Dense transformerBase + chat variants; Microsoft distributionOpen
Code Llama2023-08Llama 2 fine-tuneCode specialization, long contextOpen
Llama 32024-04Dense transformerImproved over Llama 2 across benchmarksOpen
Llama 3.1 405B2024-07Dense transformerFrontier-class open model, multilingual, long contextOpen
Llama 3.2 (11B/90B Vision)2024-09Dense + visionFirst open-weights multimodal Llama; 1B/3B edge variantsOpen
Llama 3.3 70B2024-11Dense transformerInstruction-tuned; 691K+ HF downloadsOpen
Llama 4 Maverick 17B-128E2025-04MoE multimodal128 experts, image-text-to-text, multilingualOpen
Muse Spark2026-04Undisclosed58% Humanity's Last Exam; 4th on AI Analysis Index; closedClosed

Synthesized from the events bundle; unknown cells render —.

Timeline

  1. Llama 2 released with Microsoft distribution partnership

  2. Llama 3.1 405B released — Meta's largest open-weights model to date

  3. Llama 3.2 launches Meta's first open-weights multimodal models

  4. Llama 4 Maverick and Scout debut MoE architecture in open-weights lineup

  5. Muse Spark launched — first closed-weights model from Meta Superintelligence Labs

  6. Meta AI customer support agent exploited to hijack Instagram accounts

Related topics

LlamaMuse SparkMeta Superintelligence LabsLlama-4-MaverickHugging FaceOpenAIAnthropicNVIDIA

FAQ

Has Meta abandoned open weights?

Not entirely — the Llama 4 family (Maverick, Scout) remains open-weights and actively maintained. Muse Spark is a closed-weights departure for Meta's highest-capability frontier tier, but the open-weights program continues in parallel.

What is Meta Superintelligence Labs?

It is a newly formed internal unit at Meta responsible for Muse Spark, Meta's first closed-weights frontier reasoning model, framed as the first step on a scaling ladder toward 'personal superintelligence.'

What happened with the Meta AI Instagram hack?

Attackers prompted Meta's deployed AI customer support agent to link Instagram accounts to attacker-controlled email addresses, successfully hijacking accounts including the dormant Obama White House Instagram — a concrete prompt-injection / social-engineering failure in a live consumer product.

What is MTIA and why does it matter?

MTIA (Meta Training and Inference Accelerator) is Meta's custom AI chip line, co-developed with Broadcom across four generations (300–500), targeting a 25x FLOPS improvement from generation 300 to 500 and reducing dependence on third-party silicon for inference and training workloads.

How does Muse Spark compare to Llama 4 Maverick?

Meta claims Muse Spark matches Llama 4 Maverick's capabilities with over 10x less training compute, adds multimodal reasoning and multi-agent orchestration, and ranks fourth on the Artificial Analysis Intelligence Index — but trails on coding and agentic benchmarks and withholds architecture details.

Stay current

Call Me Almanac pairs the week's AI news with guides like this one — Midweek & Sunday.

Versions

  • v1live5d ago

Related guides (4)

More on Meta (6)

7Meta Ai Blog·1mo ago·source ↗

Meta Introduces SAM Audio: Unified Multimodal Model for Audio Separation with PE-AV, Benchmark, and Judge Model

Meta has released SAM Audio, a unified multimodal audio separation model that accepts text, visual, and temporal span prompts to isolate sounds from complex audio mixtures. The system is powered by Perception Encoder Audiovisual (PE-AV), an extension of Meta's open-source Perception Encoder released earlier in 2025, and uses a flow-matching diffusion transformer architecture. Alongside the model, Meta is releasing SAM Audio-Bench (the first in-the-wild audio separation benchmark) and SAM Audio Judge (an automatic evaluation model for audio separation). All components are available today via the Segment Anything Playground.

9Meta Ai Blog·1mo ago·source ↗

Meta Introduces Muse Spark: First Model from Meta Superintelligence Labs with Multimodal Reasoning and Multi-Agent Orchestration

Meta has launched Muse Spark, the first model from its newly formed Meta Superintelligence Labs, positioned as a natively multimodal reasoning model with tool-use, visual chain-of-thought, and multi-agent orchestration capabilities. The model introduces 'Contemplating mode,' which runs multiple agents in parallel to compete with frontier reasoning modes, achieving 58% on Humanity's Last Exam and 38% on FrontierScience Research. Meta claims a greater than 10x compute efficiency improvement over Llama 4 Maverick through a rebuilt pretraining stack, and describes predictable scaling across pretraining, RL, and test-time reasoning axes. Muse Spark is available at meta.ai with a private API preview, and is framed as the first step on a scaling ladder toward 'personal superintelligence.'

7Meta Ai Blog·1mo ago·source ↗

Meta Publishes Advanced AI Scaling Framework and Safety & Preparedness Report for Muse Spark

Meta has released an updated Advanced AI Scaling Framework that expands risk evaluation categories—including chemical/biological threats, cybersecurity, and loss-of-control risks—and introduces formal Safety & Preparedness Reports tied to specific model deployments. The first such report covers Muse Spark, Meta's advanced reasoning model, detailing pre- and post-safeguard evaluations across severe risk categories and ideological balance. Meta also describes a shift in safety methodology: rather than scenario-specific refusal training, Muse Spark is trained on the reasoning behind safety principles, enabling more generalizable behavior in novel situations. The framework applies across open, API, and closed deployments.

7Meta Ai Blog·1mo ago·source ↗

Meta Announces Four MTIA AI Chip Generations in Two Years: MTIA 300–500 Roadmap

Meta has detailed a rapid four-generation MTIA chip roadmap (300, 400, 450, 500) developed in partnership with Broadcom, spanning ranking/recommendation inference and training through general GenAI workloads. Key advances include a 4.5x HBM bandwidth increase and 25x compute FLOPS improvement from MTIA 300 to 500, with MTIA 450 and 500 targeting GenAI inference with doubled and further-increased HBM bandwidth versus leading commercial products. MTIA 300 is in production for R&R training, MTIA 400 is lab-tested and entering deployment, while MTIA 450 and 500 are scheduled for mass deployment in early 2027 and 2027 respectively. The strategy emphasizes modular chiplet design and short iteration cycles to keep hardware aligned with rapidly evolving AI model requirements.

8The Batch·19d ago·source ↗

Meta Introduces Muse Spark: First Closed-Weights Model from Superintelligence Labs

Meta released Muse Spark, its first AI model in roughly a year and the debut product of its Superintelligence Labs, marking a significant departure from its open-weights Llama strategy. The natively multimodal reasoning model supports tool use and multi-agent orchestration, achieves fourth place on the Artificial Analysis Intelligence Index, and claims notable token efficiency—matching Llama 4 Maverick with over 10x less training compute. Meta withheld parameter count, architecture, and training details, positioning Muse Spark as a closed commercial product competing with OpenAI, Google, and Anthropic. The release introduces 'thought compression' via RL and a parallel multi-agent 'contemplating' mode, while showing gaps in coding and agentic benchmarks.

7The Batch·19d ago·source ↗

Meta Pivots to Closed Weights with Muse Spark; The Batch Issue 349 Roundup

Meta introduced Muse Spark, its first AI model in roughly a year and the first product from its Superintelligence Labs, marking a pivot away from its open-weights strategy toward a closed model. Muse Spark is a natively multimodal reasoning model supporting tool use and multi-agent orchestration, with three reasoning modes and a novel 'thought compression' post-training technique using RL to penalize excessive reasoning tokens. The model ranks fourth on the Artificial Analysis Intelligence Index and matches Llama 4 Maverick's capabilities with over an order of magnitude less training compute, though it trails in coding and agentic benchmarks. The issue also covers broader industry themes including AI-native software engineering team structures, big pharma AI adoption, and regulatory developments.