Anthropic announced Claude Mythos Preview on April 7, 2026 — and immediately declined to release it publicly. The benchmarks are extraordinary. The access is extraordinarily restricted. Here's what we know about the most capable Claude model ever built, why Anthropic is treating it differently from every model they've shipped before, and why the security community is genuinely alarmed about what it can do.
What Is Claude Mythos Preview?
Claude Mythos Preview is Anthropic's most advanced AI model to date, announced in early April 2026. Despite being called a "preview," Anthropic has no plans for a public release. Access is gated through Project Glasswing, a program that restricts Mythos to approximately 50 pre-screened partner organizations, including AWS, Apple, Google, and Microsoft.
The restriction isn't a staged rollout strategy or hype management. Anthropic was explicit: Mythos Preview's capabilities in cybersecurity and autonomous reasoning cross a threshold where broad public access represents a meaningful risk. This is the first time a frontier AI lab has withheld a model on those grounds since OpenAI briefly restricted GPT-2 in 2019 — and GPT-2 looks trivial by comparison.
The model is available to approved partners through AWS Bedrock and Google Vertex AI (both gated), and via the Claude API by invitation only. There is no consumer product, no waitlist, and no announced timeline for broader availability.
The Benchmark Numbers
Anthropic published benchmarks comparing Mythos Preview against Claude Opus 4.6, currently one of the strongest publicly available models:
- SWE-bench Verified (autonomous software engineering): 93.9% vs Opus 4.6's 80.8%
- USAMO 2026 (advanced mathematics olympiad): 97.6% vs Opus 4.6's 42.3%
- Cybersecurity CTF: 83.1% vs Opus 4.6's 66.6%
The USAMO result is where your eyes should land first. Going from 42% to 97% on olympiad-level mathematics is not a marginal improvement on an already-strong baseline — it's a qualitative leap on problems that require extended chains of novel reasoning. USAMO problems aren't pattern-matching exercises. They require constructing original proofs that most professional mathematicians couldn't produce under time pressure. A 97.6% score means near-perfect performance on problems designed to stump exceptional human minds.
The software engineering number is also striking. SWE-bench Verified tests a model's ability to resolve real GitHub issues on real codebases autonomously. A 93.9% score means Mythos resolved nearly all of the benchmark's issues without human intervention: not assisting a developer, but completing the fix end to end.
Why the Security Community Is Alarmed
The zero-day discovery capability is where most of the serious concern is concentrated, and it's warranted: Anthropic reports that during internal evaluation, Mythos autonomously discovered novel vulnerabilities, including a bug in OpenBSD that had gone undetected for 27 years. Here's why researchers and security professionals are treating this differently from previous AI capability announcements:
Autonomous offense at scale
Finding a single zero-day vulnerability typically requires weeks or months of work from a skilled human security researcher — deep familiarity with a target system's architecture, historical design decisions, and subtle interaction effects that don't appear in documentation. A model that can discover thousands of novel vulnerabilities autonomously doesn't just accelerate existing security research; it changes what's possible. Even a small number of malicious actors with access to Mythos-level capabilities could conduct vulnerability research at a pace and scale that current defensive infrastructure isn't designed to counter.
The OpenBSD bug is the specific proof point
OpenBSD is widely regarded as one of the most security-conscious operating systems in existence. Its codebase has been continuously audited by skilled human experts for decades, with security review baked into every line of code committed. A vulnerability surviving there for 27 years isn't a careless oversight — it's the kind of subtle, context-dependent flaw that requires genuine architectural reasoning to identify. The fact that Mythos found it suggests the model has real understanding depth, not surface-level pattern recognition. If Mythos can find that, the question becomes: what else is it finding in less rigorously audited codebases?
The exploit pipeline concern
Beyond finding vulnerabilities, security researchers have raised concerns about models capable of automatically generating functional exploits for discovered weaknesses. The dangerous scenario isn't just "AI finds a vulnerability" — it's "AI finds a vulnerability, writes a working exploit, and packages it for deployment," with the entire pipeline requiring minimal human involvement. Mythos's combination of security research capability and autonomous software engineering skill puts it close enough to that threshold that the concern is being taken seriously by people who work in offensive security professionally.
Critical infrastructure exposure
Much of the world's critical infrastructure — power grids, water treatment systems, financial networks, air traffic control, healthcare systems — runs on legacy software that was written before modern security practices existed and hasn't been comprehensively audited since. These systems have unknown vulnerabilities that human researchers haven't found because the systems are old, poorly documented, and not commercially interesting to audit. A model capable of autonomous, expert-level vulnerability research applied to critical infrastructure creates a risk surface that security agencies in multiple countries are actively discussing.
The offense/defense asymmetry problem
Cybersecurity has always been asymmetrically hard for defenders: an attacker needs to find one exploitable flaw; a defender needs to close all of them. AI-assisted offense is advancing faster than AI-assisted defense. Defensive applications — automated patch generation, vulnerability triage, security auditing — exist, but they're further behind. Releasing Mythos broadly before the defensive tooling catches up would widen a gap that's already concerning to the people responsible for protecting large-scale systems.
State actor and APT proliferation risk
Nation-state advanced persistent threat groups already conduct sophisticated cyber operations. The concern with broad AI access isn't that state actors couldn't develop comparable tools independently — they likely could, eventually. The concern is that broad commercial availability of Mythos-level capabilities would dramatically lower the barrier for less sophisticated actors — criminal organizations, smaller state programs, individual bad actors — to operate at threat levels previously reserved for well-resourced state programs. The explicit goal of Project Glasswing's 50-organization restriction is to prevent that kind of proliferation from happening through commercial channels.
Why Anthropic Restricted It — and Whether That's the Right Call
Anthropic's Responsible Scaling Policy has always included provisions for withholding models that exceed certain capability thresholds in high-risk domains. Those provisions were written before Mythos existed. The model triggered them during internal evaluation. Anthropic is framing this as policy working as designed, not as an emergency response to something unexpected.
Whether you find that reassuring or concerning probably depends on how much you trust Anthropic's judgment about where to draw the line — and whether you think 50 partner organizations is the right access level, versus broader (higher risk) or narrower (potentially leaving defensive applications underdeveloped). There's a reasonable argument that keeping Mythos restricted to well-resourced tech giants misses the opportunity for the broader security community to develop defensive tooling against these same capabilities. There's also a reasonable argument that the current restriction is the minimum necessary to prevent immediate misuse.
What's notable is that Anthropic published the benchmarks while restricting the model. The community knows what Mythos can do. The conversation about governance is happening in public, even if the model itself isn't public.
Pricing and the Enterprise Use Case
For the organizations that do have access, Mythos Preview is priced at $25 per million input tokens and $125 per million output tokens, compared to Claude Opus 4.6's $15/$75. This is firmly enterprise pricing — not a model you run for general chat, but one deployed for autonomous tasks where the quality delta justifies the cost: complex software engineering, advanced research synthesis, mathematical problem-solving at expert level, and security auditing under controlled conditions with appropriate access restrictions.
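To make the cost delta concrete, here's a minimal sketch of per-run pricing at the two rate cards quoted above. The workload figures (2M input tokens, 400k output tokens) are hypothetical, chosen only to illustrate how the per-million billing works:

```python
# Per-million-token prices (USD) as quoted in this article.
PRICES = {
    "mythos-preview": {"input": 25.0, "output": 125.0},
    "opus-4.6": {"input": 15.0, "output": 75.0},
}

def run_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for a single run; tokens are billed per million."""
    p = PRICES[model]
    return (input_tokens / 1_000_000) * p["input"] + \
           (output_tokens / 1_000_000) * p["output"]

# Hypothetical autonomous engineering task: 2M in, 400k out.
mythos = run_cost("mythos-preview", 2_000_000, 400_000)  # $50 + $50 = $100.00
opus = run_cost("opus-4.6", 2_000_000, 400_000)          # $30 + $30 = $60.00
print(f"Mythos: ${mythos:.2f}, Opus 4.6: ${opus:.2f}")
```

At these rates the premium is roughly 67% for the same token volume, which is why the model is positioned for tasks where the quality delta, not per-token cost, dominates the decision.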
What This Means for AI Users Today
For most people — including most Deepest users — Mythos Preview is not accessible and won't be for the foreseeable future. The most capable Claude model you can actually use today is Claude Opus 4.6, available on Deepest alongside the full current frontier from OpenAI, Google, xAI, and others. The gap between Opus 4.6 and Mythos is real, but Opus 4.6 remains one of the strongest models available for the kinds of tasks most professionals are doing.
Mythos matters beyond its benchmarks, though. It establishes what's technically possible right now in 2026. Capabilities that appear in restricted models tend to migrate to public models over time — the autonomous reasoning and software engineering performance demonstrated by Mythos is likely to show up in publicly accessible models within the next 12–24 months. The cybersecurity capabilities may remain more tightly controlled given their dual-use nature, but the underlying intelligence that makes them possible is harder to contain.
The 27-year-old OpenBSD bug isn't just a headline — it's evidence that we're building systems capable of original expert-level work in high-stakes domains. The question of how that capability gets distributed, governed, and used is one of the defining questions of the next few years, and Anthropic's Project Glasswing is an early, imperfect attempt to answer it.
Frequently Asked Questions
Can I access Claude Mythos Preview?
Not through standard channels. Access requires an invitation through Anthropic's Project Glasswing program, currently limited to approximately 50 partner organizations. There's no public waitlist or announced timeline for broader availability. For the most capable Claude model currently accessible to the public, Claude Opus 4.6 is available on Deepest alongside all other major frontier models.
What specifically makes Mythos's security risk different from previous AI models?
The combination of three capabilities that haven't previously existed together: autonomous discovery of novel zero-day vulnerabilities at scale, expert-level software engineering that could generate functional exploits, and the reasoning depth demonstrated by the 27-year OpenBSD discovery. Previous models could assist security researchers; Mythos appears capable of conducting independent research at an expert level. That's a qualitative difference, not just a quantitative one.
Is Anthropic right to restrict it?
Reasonable people disagree. The case for restriction is straightforward: offensive cyber capabilities at this level cause asymmetric harm if widely available before defensive tooling catches up. The case against restriction is that concentrating these capabilities in 50 large organizations isn't obviously safer than broader availability to the security community, and may actually slow defensive development. Anthropic's call is principled and consistent with their stated policy — whether it's the right call is a legitimate open question.
Will these capabilities eventually reach the public?
Historically, capabilities from restricted models migrate to public models over time. The autonomous reasoning and software engineering capabilities in Mythos are likely to appear in consumer-accessible models within 12–24 months. The most sensitive cybersecurity capabilities may remain more controlled, but the underlying intelligence driving them will eventually be present in broadly available models. The governance frameworks for handling that transition are still being figured out.