Meta Muse Spark

Closed · Multimodal

Meta

Meta's flagship closed-source multimodal AI model with natively integrated text, image, and speech capabilities and multi-agent Contemplating mode.

Overview

Muse Spark is the first model from Meta Superintelligence Labs (MSL), the new research division led by Alexandr Wang. Internally codenamed "Avocado," it represents a clean-sheet architecture rebuilt from scratch over nine months — this is not a Llama derivative. MSL deliberately abandoned the Llama lineage to explore a fundamentally different approach to multimodal intelligence, one that natively fuses text, image generation, and speech synthesis into a single model rather than bolting modalities together post-training.

The most distinctive technical innovation is "thought compression," a reinforcement learning technique that trains the model to reach correct answers using fewer reasoning tokens. Across the full evaluation suite, GPT-5.4 consumed 120M tokens and Claude used 157M, while Muse Spark completed the same benchmarks with 58M, so the other two models used roughly 2.1x and 2.7x as many tokens. This matters for deployment at Meta's scale, where the model serves Facebook, Instagram, WhatsApp, Messenger, and Ray-Ban Meta smart glasses simultaneously.
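The efficiency ratios follow directly from the token counts quoted above. A quick check in Python, where the dictionary and function names are illustrative only:

```python
# Reasoning-token totals quoted in the overview, in millions of tokens.
tokens_millions = {"Muse Spark": 58, "GPT-5.4": 120, "Claude": 157}

baseline = tokens_millions["Muse Spark"]


def ratio_vs_muse(model: str) -> float:
    """Return how many times more tokens `model` used than Muse Spark."""
    return tokens_millions[model] / baseline


if __name__ == "__main__":
    for name in ("GPT-5.4", "Claude"):
        print(f"{name}: {ratio_vs_muse(name):.2f}x the tokens of Muse Spark")
```

The ratios come out to about 2.07 and 2.71, which is where the "2-3x" summary figure comes from.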

Muse Spark operates in three distinct modes: Instant (fast responses, no extended reasoning), Thinking (single-agent chain-of-thought), and Contemplating (multi-agent parallel reasoning). Contemplating is the most novel of the three: it runs several reasoning agents in parallel, each exploring a different solution path, then synthesizes their findings. This mode is credited with the model's results on complex problems such as HealthBench Hard (42.8%, #1 among all models) and CharXiv Reasoning (86.4%, also #1).
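The fan-out-then-synthesize pattern can be illustrated with a toy sketch. This is not Meta's implementation, and Muse Spark has no public API. The agent functions are hypothetical stand-ins for reasoning threads, and majority voting is an assumed synthesis step, since the source says only that findings are "synthesized."

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Sequence

# An "agent" here is any function mapping a problem to a candidate answer.
# The toy problem is summing a list of integers.
Agent = Callable[[Sequence[int]], int]


def agent_builtin(nums: Sequence[int]) -> int:
    """Solution path 1: use the built-in sum."""
    return sum(nums)


def agent_loop(nums: Sequence[int]) -> int:
    """Solution path 2: accumulate in an explicit loop."""
    total = 0
    for n in nums:
        total += n
    return total


def agent_faulty(nums: Sequence[int]) -> int:
    """Solution path 3: deliberately wrong (drops the last element)."""
    return sum(nums[:-1])


def contemplate(problem: Sequence[int], agents: Sequence[Agent]) -> int:
    """Run all agents in parallel, then pick the most common answer."""
    with ThreadPoolExecutor(max_workers=len(agents)) as pool:
        candidates = list(pool.map(lambda agent: agent(problem), agents))
    # Assumed synthesis step: simple majority vote over candidate answers.
    answer, _votes = Counter(candidates).most_common(1)[0]
    return answer


if __name__ == "__main__":
    print(contemplate([3, 4, 5], [agent_builtin, agent_loop, agent_faulty]))
```

Here two of the three paths agree on 12, so the faulty path is outvoted. A real system would presumably use a learned or model-driven synthesis step rather than a vote.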

Notably, Muse Spark is entirely closed-source and available only through Meta's consumer products (meta.ai and the Meta apps), a major departure from Meta's open-source tradition with Llama. There is no public API, no downloadable weights, and no pricing structure. This strategic pivot signals MSL's focus on direct consumer experiences over developer ecosystem building.

Release Date

2026-04-08

Parameters

Unknown

Context Window

262K tokens

Input Price

N/A

Output Price

N/A

Speed

Unknown

Benchmarks

Benchmark           Score    Max
AA Index            52%      100%
GPQA Diamond        89.5%    100%
SWE-bench Pro       55%      100%
HLE                 58%      100%
HealthBench Hard    42.8%    100%

Capabilities

Multi-Agent Reasoning

Contemplating mode runs parallel reasoning threads that explore different solution paths simultaneously

Token Efficiency

Thought compression, a reinforcement learning technique, let the model finish the full evaluation suite with 58M tokens vs 120M (GPT-5.4) and 157M (Claude)

Medical Knowledge

#1 on HealthBench Hard at 42.8%, the top score among all models on that benchmark

Visual Understanding

#1 on CharXiv Reasoning at 86.4%, excelling at chart and scientific figure interpretation

Native Multimodal

Text, image generation, and speech synthesis fused into a single architecture from the ground up

Consumer Scale

Deployed across Facebook, Instagram, WhatsApp, Messenger, and Ray-Ban Meta smart glasses

Getting Started

1. Visit meta.ai

Open meta.ai in any browser to start chatting immediately; no account is required for basic access

2. Sign In for Full Features

Log in with your Meta account to unlock Contemplating mode and image generation

3. Choose Your Mode

Select Instant for quick answers, Thinking for step-by-step reasoning, or Contemplating for complex multi-step problems

4. Access via Meta Apps

Muse Spark is built into Facebook, Instagram, WhatsApp, and Messenger; tap the AI assistant icon in any app

Pros & Cons

Strengths

  • Exceptional token efficiency (GPT-5.4 and Claude used roughly 2.1x to 2.7x as many tokens)
  • Free consumer access via Meta AI
  • #1 in health/medical (HealthBench Hard 42.8%)
  • Strong visual understanding (CharXiv Reasoning #1 at 86.4%)
  • Natively multimodal (text + image + speech)
  • Multi-agent orchestration (Contemplating mode)

Weaknesses

  • Closed source (major departure from Meta's open tradition)
  • Coding gap vs Claude and GPT-5.4 (SWE-bench Pro: 55%)
  • No public API or pricing
  • Abstract reasoning weakness (ARC AGI 2: 42.5)
  • Ecosystem lock-in (Meta account required for Contemplating mode and image generation)
  • Benchmark trust concerns from Llama 4 history

Best For

  • Medical/health queries
  • Visual understanding
  • Cost-sensitive general queries (free)
  • Multi-agent orchestration
