Claude Fable 5: what we know so far

Yesterday Anthropic released Claude Fable 5 to the public. If you follow frontier AI, you know why this matters: Fable is the first publicly accessible version of Anthropic’s Mythos-class model, the tier they initially decided was too capable to release at all.

Two months ago, Mythos Preview was made available to a small, vetted group of cyberdefenders and critical infrastructure providers through Project Glasswing. Yesterday, a version of the same underlying model went live for anyone with a Claude subscription.

That’s the headline. Here’s what sits underneath it, and what we know so far.

What Fable actually is

Fable 5 and Mythos 5 run on the same underlying model. The distinction between them is not capability; it’s permission. Anthropic built Fable by attaching safety classifiers to Mythos 5: separate AI systems that monitor each incoming request and redirect anything touching cybersecurity, biology, chemistry, or model distillation to Claude Opus 4.8 instead. The user is told when this happens. Anthropic says the fallback fires in fewer than 5% of sessions overall.

For a general knowledge worker, that 5% is nearly invisible. For anyone running security work, that 5% is the job.
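
To make the routing pattern described above concrete, here is a minimal Python sketch. It is not Anthropic’s implementation: the article describes the real classifiers as separate AI systems, while this stand-in uses an invented keyword list, and every name in it (`classify`, `route`, `RoutingDecision`, the model labels) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical labels for the two models; not real API identifiers.
PRIMARY_MODEL = "fable-5"
FALLBACK_MODEL = "opus-4.8"

# The four restricted domains come from the article. The keywords are
# invented placeholders standing in for a real learned classifier.
RESTRICTED_KEYWORDS = {
    "cybersecurity": ("exploit", "malware", "attack chain"),
    "biology": ("pathogen",),
    "chemistry": ("nerve agent",),
    "model distillation": ("distill",),
}


@dataclass
class RoutingDecision:
    model: str                     # which model answers the request
    flagged_domain: Optional[str]  # restricted domain that matched, if any
    user_notice: Optional[str]     # message shown to the user on fallback


def classify(prompt: str) -> Optional[str]:
    """Return the first restricted domain the prompt appears to touch."""
    text = prompt.lower()
    for domain, keywords in RESTRICTED_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return domain
    return None


def route(prompt: str) -> RoutingDecision:
    """Send flagged prompts to the fallback model and tell the user."""
    domain = classify(prompt)
    if domain is None:
        return RoutingDecision(PRIMARY_MODEL, None, None)
    notice = f"This request was handled by {FALLBACK_MODEL} ({domain})."
    return RoutingDecision(FALLBACK_MODEL, domain, notice)
```

In this sketch, a prompt such as "Summarize this quarterly report" stays on the primary model, while one that mentions an exploit is routed to the fallback with a notice attached. The point is the shape of the design: the gate sits in front of the model, so a flagged request never reaches it at all.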

It’s too early to judge its impact

We’ve been putting Fable through its paces since launch. The issue isn’t that the model refuses outright; it’s that Anthropic tuned the classifiers conservatively by design, so they catch far more than genuine misuse attempts, which limits how useful Fable is for cybersecurity testing. Anthropic says this plainly in their own launch post: all cybersecurity queries will trigger the fallback.

Ask Fable to reason through an attack chain, analyze malicious code for defensive purposes, or assist with anything resembling an offensive security task, and you frequently get Opus 4.8 instead. That’s a good model, but it’s not the model you’re trying to evaluate.

A fair assessment of a model’s security capabilities requires the model to actually engage with security questions. Fable, structurally, won’t. We’re not complaining about that design decision. Anthropic has published their reasoning, and it holds together.

On the internals

The security community doesn’t need to speculate about what’s inside Fable. The full system prompt, roughly 120,000 characters, was published on GitHub by a security researcher within hours of launch. Anyone who wants to read Anthropic’s behavioral instructions in full can do so right now. We can confirm the findings.

The architecture of the restrictions is more revealing than the restrictions themselves. Where the guardrails sit, what logic drives the fallback, and what the model understands about its own constraints tell you quite a bit about the underlying design priorities.

What’s actually worth watching

The model that matters for security work is Mythos 5, not Fable. Right now, access to Mythos is limited to Project Glasswing partners, a select group of cyberdefenders working in coordination with the US government. Anthropic has said it will expand Mythos 5 access through a broader trusted access program, but the details haven’t been published yet.

When that access expands, the real evaluation can start.

Mythos Preview demonstrated genuine capability in autonomous exploit discovery and chained vulnerability analysis. Whether the full Mythos 5 represents a material step-change for professional offensive security, or whether the current reputation is outrunning what the model can actually deliver on realistic red team tasks, is a question we intend to answer properly.