Blog

What is Claude Fable 5? Benchmarks and pricing

Claude Fable 5 launched June 9, 2026 as Anthropic's most capable public model. Here is what it is, what the benchmarks actually mean, how it compares to GPT-5.5 and Gemini 3.1 Pro, and which work it is built for.

Phos Team ·
Phos AI Labs AI Strategy AI Agents

For the last year, Anthropic’s most capable AI model was locked inside a restricted program called Project Glasswing, available only to vetted infrastructure providers and cybersecurity researchers.

On June 9, 2026, Anthropic released Claude Fable 5: a publicly available version of that Mythos-class technology, with safety classifiers that redirect the most sensitive requests to Claude Opus 4.8.

The benchmarks are the most decisive Anthropic has published in years. Whether they translate to your specific work is the question this article answers.


What Claude Fable 5 is: the model in plain language

The Mythos class explained

Anthropic introduced a new model tier called “Mythos” in April 2026. Mythos models are designed for capabilities above the Opus line, long-horizon autonomous tasks, complex agentic workflows, and extended reasoning on genuinely hard problems.

The Mythos class has two members as of June 9, 2026:

  • Claude Fable 5: the publicly available version. Available via the Claude API (model ID: claude-fable-5). Claude apps. Amazon Bedrock. Vertex AI, and Microsoft Foundry. Includes safety classifiers that redirect cybersecurity, biology, chemistry, and distillation queries to Opus 4.8.
  • Claude Mythos 5: the same underlying model without the biology and chemistry restrictions (cybersecurity restrictions remain). Available only through Project Glasswing, a restricted access program for vetted infrastructure providers. Not a general release.

“Fable” is the public version of Mythos; full capability; guardrails on for the highest-risk domains.

Key technical specifications

SpecClaude Fable 5
Model IDclaude-fable-5
Context window1 million tokens
Max output tokens128,000
Multimodal inputsText; images; files
Adaptive thinkingAlways on (no toggle)
Knowledge cutoffJanuary 2026
Launch dateJune 9, 2026
Safety fallbackOpus 4.8 on restricted requests
Data retention30 days (required for safety classifiers)
Zero data retentionNot available

One important note on the context window: Fable 5 uses the tokenizer introduced with Opus 4.7, which produces roughly 30% more tokens for the same text than older models. The 1 million token window is real; it fills faster than the number alone implies.


What Fable 5 is built for: the capability picture

Fable 5 is explicitly optimised for work that previous models could not sustain across a full task.

Long-horizon autonomous coding

The model can read a large codebase, identify issues, write fixes across multiple related files, and carry the solution to completion without human prompting at each step.

Stripe reported in early testing that Fable 5 performed a codebase-wide migration in a day that would otherwise have taken a whole team over two months by hand; that is the clearest available example of what “long-horizon” means in practice.

On SWE-Bench Pro — which measures end-to-end resolution of real GitHub issues — Fable 5 and Mythos 5 reach 80.3%, vastly outperforming OpenAI’s latest general model GPT-5.5, which scored 58.6%.

Complex document and knowledge work

Multi-document synthesis, compliance review, technical report generation, and senior-level financial reasoning are the core use cases. On Hebbia’s Finance Benchmark for senior-level reasoning, Fable 5 has the highest score of any model, with substantial gains in document-based reasoning, chart and table interpretation, and problem solving.

Vision and multimodal tasks

Fable 5 supports image and file inputs and can extract precise numbers from detailed scientific figures. It understands diagrams, charts, and tables nested in files and PDFs, opening up research and document-heavy work in finance, legal, analytics, architecture, and gaming.

Where Fable 5’s advantage is largest

The model’s lead over competitors and over Opus 4.8 grows with task complexity, task length, and the number of interdependent steps. Fable 5’s advantage is largest on hard, multi-step, autonomous work and smallest on tasks where all frontier models have converged.

On short, simple tasks, the performance gap narrows and the cost difference makes Opus 4.8 or Sonnet 4.6 the better choice.


Claude Fable 5 vs Claude Opus 4.8: the upgrade within the Claude family

The benchmark gap

BenchmarkClaude Fable 5Claude Opus 4.8Difference
SWE-Bench Pro80.3%69.2%+11.1pp
Context window1M tokens200K tokens5× larger
Max output tokens128K32K4× larger
Adaptive thinkingAlways onOptional toggleDifferent UX

The cost difference

Fable 5 costs $10/$50 per million input/output tokens. Opus 4.8 costs approximately $5/$25; half the price. For the same number of tokens, Fable 5 costs twice as much to run.

When the upgrade is justified

The upgrade from Opus 4.8 to Fable 5 earns its cost when the task:

  • Requires more than 200K tokens of context (whole-repository analysis, large document sets)
  • Benefits from more than 32K output tokens in a single response
  • Is a multi-step agentic workflow where the 11-point SWE-Bench Pro advantage compounds across iterations
  • Is so long and complex that the quality gap at each step makes a meaningful difference to the final output

The upgrade is not justified when:

  • The task fits comfortably in Opus 4.8’s context window
  • The task is short, single-step, or does not involve agentic reasoning
  • Cost is the primary constraint

The safety fallback implication

For cybersecurity, biology, chemistry, and distillation work, Fable 5 falls back to Opus 4.8 silently.

If the work is primarily in these domains; the effective model is Opus 4.8 at Fable 5 prices. This is a significant cost consideration for teams in regulated or research-heavy industries.


Claude Fable 5 vs GPT-5.5: the primary competitive comparison

GPT-5.5 was released by OpenAI on April 23, 2026, six weeks before Fable 5. It is the model the largest share of engineering teams currently use in production.

Benchmark comparison (vendor-reported)

BenchmarkClaude Fable 5GPT-5.5Winner
SWE-Bench Pro80.3%58.6%Fable 5 (+21.7pp)
GDPval-AA knowledge work1,9321,769Fable 5
Computer use85.0%78.7%Fable 5
GPQA Diamond (science)~92.6%93.6%Near parity
Context window1M tokens1M tokensTied

Pricing comparison

Claude Fable 5GPT-5.5
Input (per 1M tokens)$10.00$5.00
Output (per 1M tokens)$50.00$30.00
Cost differentialBaseline~50–60% cheaper

The honest summary

Fable 5’s lead over GPT-5.5 is 21.7 points; larger than the gap between GPT-5.5 and Gemini. That is the largest observed performance gap between two frontier models at this class level in any publicly reported benchmark at launch.

GPT-5.5 costs half as much.

For agentic coding plus the lowest hallucination rate, pick Claude Fable 5. For cheaper agentic work with native computer use, where you can tolerate a higher hallucination rate, pick GPT-5.5.

Independent testing is stark on the hallucination question. Fable 5 posts by far the lowest hallucination rate of the three on independent benchmark testing, at 36.18% versus Gemini’s 49.87% and GPT-5.5’s 85.53%.

The deployment decision

  • Choose Fable 5 for: end-to-end codebase resolution, multi-day autonomous workflows, tasks where the output quality gap at each step compounds significantly, work where wrong answers are expensive
  • Choose GPT-5.5 for: cost-sensitive production deployments, teams already embedded in the OpenAI ecosystem, tasks where 58.6% SWE-Bench performance is sufficient

Note: GPT-5.5’s launch benchmarks compared it against Claude Opus 4.7 (not Fable 5); so the cross-vendor comparison is directional rather than controlled.


Claude Fable 5 vs Gemini 3.1 Pro: the cost and breadth comparison

Gemini 3.1 Pro (Google) was released February 19, 2026, four months before Fable 5. It is positioned as the cost-competitive frontier model with the deepest Google Workspace and multimodal integration.

Benchmark comparison

BenchmarkClaude Fable 5Gemini 3.1 ProWinner
SWE-Bench Pro80.3%54.2%Fable 5 (+26.1pp)
GDPval-AA knowledge work1,9321,314Fable 5
GPQA Diamond (science)~92.6%94.3%Near parity

Pricing comparison

Claude Fable 5Gemini 3.1 Pro
Input (per 1M tokens)$10.00~$2.00
Output (per 1M tokens)$50.00~$8.00
Cost differentialBaseline~5× cheaper on input

Gemini 3.1 Pro is the cost leader at the frontier tier. Batch mode halves the cost further, making it competitive for very high-volume, lower-judgment use cases.

The deployment decision

  • Choose Fable 5 for: agentic coding and knowledge work where the 26-point SWE-Bench Pro gap is material, tasks requiring the longest context and highest output quality, work where hallucination rate is a primary risk
  • Choose Gemini 3.1 Pro for: cost-sensitive deployments, Google Workspace integration, multimodal breadth, high-volume commodity AI tasks

Where Fable 5 sits in the full 2026 model landscape

The Claude model family as of June 2026

ModelTierBest use caseRelative cost
Claude Fable 5Mythos-class (public)Long-horizon agentic coding and knowledge work$$$$
Claude Mythos 5Mythos-class (restricted)Cybersecurity; bio/chem research; Project Glasswing only$$$$
Claude Opus 4.8OpusComplex reasoning; writing; analysis$$
Claude Sonnet 4.6SonnetEveryday professional tasks; moderate complexity$
Claude Haiku 4.5HaikuHigh-volume; low-complexity; fast tasks¢

The full frontier comparison (June 2026)

ModelProviderSWE-Bench ProInput cost/1MBest for
Claude Fable 5Anthropic80.3%$10.00Hardest agentic coding and knowledge work
Claude Mythos 5Anthropic80.3%+$10.00Same; plus unrestricted bio/chem (vetted access)
Claude Opus 4.8Anthropic69.2%~$5.00Best Claude below Fable 5; cost-efficient quality
GPT-5.5OpenAI58.6%$5.00Mid-range agentic work; Codex ecosystem
Gemini 3.1 ProGoogle54.2%~$2.00Cost-sensitive; Google Workspace; high volume

On Artificial Analysis’s Intelligence Index, Fable 5 ranked 65; ahead of OpenAI’s GPT-5.5 at 60 and Google’s Gemini 3.1 Pro Preview at 57.


Pricing; availability; and the subscription window

API pricing

Claude Fable 5
Input$10 per million tokens
Output$50 per million tokens
Prompt cache reads~$1 per million tokens (90% discount)
Data retention30 days (mandatory for safety classifiers)
Zero data retentionNot available

The 30-day data retention requirement is a material consideration for regulated industries; healthcare, legal, finance, and government workflows. Fable 5 is not a zero-data-retention model.

Subscription availability

Through June 22, Fable 5 is included in Pro, Max, Team, and seat-based Enterprise plans at no extra cost.

On June 23, Anthropic will pull Fable 5 from those plans, requiring usage credits going forward, with plans to restore it as a standard subscription feature as soon as possible.

Platform availability

Claude API, Claude.ai, Claude Code, Claude desktop app, Amazon Bedrock (US East and Europe Stockholm at launch), Claude Platform on AWS, Vertex AI, Microsoft Foundry, and GitHub Copilot.

GitHub Copilot note: Fable 5 in Copilot requires up to 30 days of prompt-and-output retention to run the safety classifiers; this setting is off by default for Copilot admins and must be enabled explicitly.


The practical decision framework: which model for which work

Choose Claude Fable 5 when:

  • The task involves resolving real software engineering issues end-to-end across a large codebase
  • The workflow requires more than 200K tokens of context in a single session
  • The task involves multi-day or multi-step autonomous execution where per-step quality compounds
  • The output needs to be 128K tokens or more in a single response
  • Hallucination rate is a primary constraint (Fable 5 hallucinates meaningfully less than GPT-5.5 and Gemini 3.1 Pro on complex tasks, per independent testing)

Choose Claude Opus 4.8 when:

  • The task fits in 200K tokens of context
  • Cost is a constraint and the 11-point SWE-Bench Pro gap is not material to the specific work
  • The task involves restricted domains (cybersecurity, biology, chemistry) where Fable 5 falls back to Opus 4.8 anyway

Choose GPT-5.5 when:

  • The team is already embedded in the OpenAI ecosystem (Codex, existing production infrastructure)
  • The task does not require Fable 5’s ceiling and cost is a primary constraint
  • The specific use case benefits from GPT-5.5’s native computer use implementation

Choose Gemini 3.1 Pro when:

  • Cost is the primary constraint and volume is high
  • The work is primarily within the Google ecosystem (Workspace, Docs, Sheets, Drive)
  • The task is multimodal-heavy in ways that benefit from Google’s integration

For mid-market business workflows

For a $5M–$25M non-tech company evaluating which model to use in a shared AI workspace: Claude Sonnet 4.6 or Opus 4.8 remain the right default for most business workflows; proposals, client communications, reporting, analysis.

Fable 5’s capabilities are most relevant for technical teams working on software and for research-intensive tasks that genuinely exceed what Opus 4.8 can sustain across a full task.

At $50 per million output tokens; it is not the right model for high-frequency; moderate-complexity business workflows.


Common questions on Claude Fable 5

”Is Claude Fable 5 available in Claude.ai?”

Yes. Claude Fable 5 is available on the Claude Platform natively, through available marketplaces, and in Amazon Web Services, Google Cloud, and Microsoft Foundry.

It is available through Claude.ai on Pro, Max, Team, and Enterprise plans during the introductory window through June 22, 2026. After that date, usage credits are required.

”What is the difference between Claude Fable 5 and Claude Mythos 5?”

Fable 5 and Mythos 5 share the same underlying architecture. The difference is the safety classifiers.

Claude Fable 5 includes safeguards that limit its performance in specific areas where misuse risk is elevated. Harmful prompts related to cybersecurity, biology, chemistry, and health fall back to receive a response from Opus 4.8 instead.

The same model without these limits is Claude Mythos 5, available only to a small group of vetted customers.

”Can I use Claude Fable 5 for healthcare data?”

Not under zero data retention terms. Fable 5 requires 30-day data retention for the safety classifiers to function. HIPAA-covered entities and organisations that require ZDR policies cannot use Fable 5 in those contexts.

For healthcare work that does not require ZDR, check your organisation’s data processing agreement with Anthropic before deploying.

”Is Claude Fable 5 better than GPT-5.5 for writing?”

For complex, long-form analytical writing (financial reports, technical documentation, multi-document synthesis), Fable 5’s knowledge work benchmark advantage (GDPval-AA: 1,932 vs 1,769) and lower hallucination rate suggest better output on genuinely hard writing tasks.

For shorter, simpler writing tasks, the gap narrows and the cost difference becomes the more relevant factor. Claude Opus 4.8 or Sonnet 4.6 at lower cost are typically sufficient for everyday business writing workflows.

”What happens when Fable 5 hits a restricted request?”

Queries in restricted domains are automatically routed to Opus 4.8 if flagged by the safety classifiers. You won’t be charged Fable prices for rerouted requests.

The fallback happens silently from the user’s perspective. The response comes from Opus 4.8; the charge is at Opus 4.8 rates.

”When will Fable 5 be included in standard Claude subscription plans?”

Anthropic has stated the intent to restore Fable 5 as a standard subscription feature as soon as possible, with no specific date confirmed.

The introductory free-inclusion window runs through June 22, 2026. After that date, usage credits are required until the plan pricing is updated.


Evaluating which AI model belongs in your company’s shared workspace?

Claude Fable 5 is the most capable model Anthropic has ever made publicly available; and the benchmark lead on agentic coding is decisive, not close.

The honest practical question is whether the specific work requires that ceiling. For the hardest long-horizon coding and knowledge work, Fable 5 is the current leader.

For cost-sensitive production deployments of moderate-complexity tasks, GPT-5.5 at half the price or Opus 4.8 at a quarter of the output cost remain the right choices.

The 2026 frontier is genuinely tiered; and the Fable 5 tier earns its price on the specific work it was built for.

For technical teams: the benchmark table above gives the direct comparison. The hallucination rate data and SWE-Bench Pro lead make Fable 5 the clear choice for the hardest autonomous coding and knowledge work.

For business founders: model selection is one decision inside a broader AI strategy. The right model for a $15M professional services firm’s daily workflows is different from the right model for a software team’s codebase resolution. Phos AI Labs works with mid-market companies to identify the right model tier for each workflow type, and to build the shared workspace infrastructure that makes any model produce company-specific outputs rather than generic ones. We have run 400+ AI engagements. Clients include Zapier, Coca-Cola, Medtronic, Dataiku, and American Express. Thirty minutes, no deck. Start here.

The fastest way to know whether we're the right fit, is a conversation.

STEP 1/2 · ABOUT YOU