Technology ◆ AI Safety · April–May 2026

§ Tech Intelligence / Mythos Preview

Anthropic Built an AI That Can Break Into Anything. Then Someone Broke Into That.

Claude Mythos Preview can find and exploit zero-day vulnerabilities in every major operating system and web browser — capabilities Anthropic called too dangerous to release publicly. They locked it behind a vetted partner program. Fourteen days later, a Discord group got in by guessing the URL.

181×

Firefox exploits developed

Mythos vs 2 for Claude Opus 4.6 — same task, several hundred attempts each

99%

Vulnerabilities still unpatched

Of all zero-days Mythos found in real codebases as of April 2026

$100M

Committed to Project Glasswing

Usage credits for partner organizations + $4M to open-source security orgs

Editorial illustration: Anthropic Mythos — an AI that can break into anything, and then someone broke into that

§ 01 / The Capability

It found a 27-year-old bug in OpenBSD. Then a 16-year-old one in FFmpeg. Then a 17-year-old one in FreeBSD. It wrote the working exploit for each.

Claude Mythos Preview, announced April 7, 2026, is a general-purpose language model that Anthropic says is “substantially beyond those of any model we have previously trained.” Most large language models are benchmarked on math, coding, and reasoning. Mythos was benchmarked differently: against real open-source codebases, tasked with finding and exploiting vulnerabilities that human engineers hadn’t found yet.

The results, documented by Anthropic’s own red team, are specific. Mythos produced working Firefox JavaScript engine exploits 181 times across several hundred attempts. The previous frontier model, Claude Opus 4.6, produced two. Mythos achieved a full control-flow hijack — the deepest class of exploit — on ten separate, fully patched OSS-Fuzz targets. Opus 4.6 reached tier 3 vulnerability severity once.

Among Mythos’s specific discoveries: a 27-year-old bug in OpenBSD’s SACK protocol enabling remote denial-of-service; a 16-year-old vulnerability in FFmpeg’s H.264 codec; a 17-year-old FreeBSD NFS remote code execution flaw granting unauthenticated root access (CVE-2026-4747); and memory corruption bugs in cryptography libraries affecting TLS, AES-GCM, and SSH. Anthropic engineers with no formal security background were asking Mythos to find remote code execution vulnerabilities overnight and waking up to a complete, working exploit.

Capability Benchmark

Mythos Preview vs. Claude Opus 4.6

Cybersecurity Vulnerability Reproduction

Mythos

83.1%

Opus 4.6

66.6%

Internal Anthropic benchmark · Source: red.anthropic.com/2026/mythos-preview

Expert Hacking Task Completion (U.K. AI Security Institute)

Mythos

73%

Opus 4.6

~2%

Prior to April 2025, no AI model could complete such tasks · AISI evaluation

Firefox Zero-Day Exploit Development (out of ~200 attempts)

Mythos

181 / ~200

Opus 4.6

2 / ~200

Absolute figures: 181 working exploits (Mythos) vs 2 (Opus 4.6) · Same test conditions

Mythos Preview

Claude Opus 4.6

Anthropic notes the capabilities were not explicitly trained. They “emerged as a downstream consequence of general improvements in code, reasoning, and autonomy” — meaning the exploit-development ability is a side effect of making the model better at programming, not a deliberate design choice.

§ 02 / The Decision

The first AI withheld from public release since OpenAI held back GPT-2 in 2019.

Anthropic’s stated reasoning: Mythos poses significant near-term security risk during a “transitional period.” Defenders and attackers both benefit from a model that can find and exploit vulnerabilities at this level — but in the immediate window before defensive infrastructure catches up, attackers hold the advantage. Releasing Mythos to the public before that window closes would, in Anthropic’s assessment, be irresponsible.

Context

Not since OpenAI temporarily withheld GPT-2 in 2019 — citing concerns about synthetic text misuse — has a major AI lab deemed a frontier model too dangerous to release publicly. That decision was later reversed as GPT-2 proved less catastrophic than feared. Anthropic has not set a timeline for reversing this one. Over 99% of the vulnerabilities Mythos found remain unpatched.

Anthropic’s long-term projection: defenders will ultimately benefit more than attackers. Security teams can use models like Mythos to find and fix bugs before new code ships. Patch cycles can shorten. Incident response can be automated. But that future is contingent on defenders adopting these tools first — and most organizations aren’t there yet.

“AI capabilities have crossed a threshold that fundamentally changes the urgency required to protect critical infrastructure.”
Cisco executive — Project Glasswing launch statement · April 7, 2026

§ 03 / Project Glasswing

Eleven launch partners. Forty-plus organizations. A $100 million commitment. A controlled channel while the rest of the world waits.

Instead of a public release, Anthropic launched Project Glasswing: a vetted access program giving selected organizations the ability to use Mythos for defensive security work. The stated goal is to enable defenders to begin securing the most critical systems before similar models become widely available — and to establish coordinated vulnerability disclosure standards before the next generation of AI makes the current ad-hoc process unworkable.

Project Glasswing — Launch Partners

Amazon Web Services

Apple

Broadcom

Cisco

CrowdStrike

Google

JPMorganChase

Linux Foundation

Microsoft

NVIDIA

Palo Alto Networks

Additionally: 40+ organizations maintaining critical software infrastructure — full list not published · Source: anthropic.com/glasswing

$100M

Mythos Preview usage credits

$2.5M

Alpha-Omega + OpenSSF via Linux Foundation

$1.5M

Apache Software Foundation

Anthropic has also committed to a Cyber Verification Program that will allow legitimate security professionals outside the initial partner group to apply for appropriate access. Post-preview pricing is set at $25 per million input tokens and $125 per million output tokens — significantly above standard rates, reflecting the controlled-access model.

§ 04 / The Breach

They didn’t need a sophisticated exploit. They guessed the URL.

Fourteen days after Anthropic locked Mythos behind a vetted partner program, Bloomberg reported that an unauthorized Discord community had already been using it regularly. The group — described as “non-malicious and driven largely by curiosity” — gained access through a third-party vendor environment. One member was a contractor for Anthropic. Another key factor: they guessed the URL based on Anthropic’s naming conventions. No sophisticated exploit required.

Late March 2026

Data leak reveals Mythos exists

Before any official announcement, a data leak exposes Anthropic's testing of a new model described internally as a 'step change in capabilities.' Fortune reports on it first.

April 7, 2026

Official launch + Project Glasswing

Anthropic announces Claude Mythos Preview and Project Glasswing simultaneously. Access is restricted to 11 launch partners (Microsoft, Google, Apple, AWS, JPMorgan Chase, Nvidia, and others) plus 40+ organizations. The model is not publicly available.

April 17, 2026

White House meeting

CEO Dario Amodei meets with White House Chief of Staff Susie Wiles and Treasury Secretary Scott Bessent. Talks described as 'productive and constructive.' German banks and the Bank of England begin separate risk assessments.

April 21, 2026

Bloomberg: Discord group was already inside

Bloomberg reports that a Discord community focused on unreleased AI models gained access through a third-party vendor environment. The group didn't use a sophisticated exploit — they guessed the URL based on Anthropic's naming conventions. A member was a contractor for Anthropic. The group had been regularly using Mythos since launch day.

April 22–24, 2026

Coverage erupts — 'hackers breach'

Euronews, TechCrunch, CyberNews, Fortune, and the Washington Post all cover the breach within 48 hours. Washington Post: 'AI hacking fears jolt Washington as Anthropic unveils Mythos.' Anthropic confirms it is investigating but states: 'There is currently no evidence that Anthropic's systems are impacted, nor that the reported activity extended beyond the third-party vendor environment.'

Anthropic Statement

“There is currently no evidence that Anthropic’s systems are impacted, nor that the reported activity extended beyond the third-party vendor environment. We are investigating a report claiming unauthorized access to Claude Mythos Preview through one of our third-party vendor environments.”

§ 05 / Expert Reactions

Security researchers agree the capabilities are real. They disagree about what that means.

No serious researcher disputes Mythos’s capabilities. The benchmarks are Anthropic’s own, but they are specific, falsifiable, and backed by named CVEs and real codebases. The debate is about consequences.

“A large fraction of cybersecurity professors believe this is pretty much what was expected.”
Peter Swire, Georgia Tech · Scientific American, April 2026

“It's a big deal, but it's unlikely to prove to be the end of the world.”
Ciaran Martin, Oxford University · Scientific American, April 2026

Both Swire and Martin noted that regulatory alarm partly reflects institutional incentives — agencies and advisory bodies benefit from emphasizing risk. Martin suggested that the actual expected harm was lower than the most alarming predictions. A Fortune analysis reached a similar conclusion from the industry side: the real bottleneck isn’t finding vulnerabilities, it’s fixing them. Patch cycles at most organizations are measured in months or years, not hours. Until that changes, the leverage from Mythos-class capabilities may be more limited than feared.

“I don't want AI turned on our own people.”
Dario Amodei, Anthropic CEO · Financial Times Weekend, 2026

David Sacks, in early commentary, speculated that Anthropic’s real motive for withholding Mythos was a compute shortage — “by holding it back, they create this impression of scarcity and altruism.” He later gave Anthropic the benefit of the doubt on the cybersecurity concerns. The White House meeting with Dario Amodei — attended by Chief of Staff Susie Wiles and Treasury Secretary Scott Bessent — suggests the administration takes the risk seriously regardless of Sacks’s initial skepticism.

§ 06 / The Threat Landscape

Mythos didn’t create the AI cyberthreat era. It arrived in the middle of one.

Before Mythos was announced, AI-powered attacks were already accelerating. IBM’s X-Force 2026 Threat Index found a 49% increase in active ransomware groups year-over-year, driven partly by AI tooling that allows smaller, less technically sophisticated operators to run credible campaigns. Attackers are using AI to scan networks at 36,000 probes per second, generate phishing emails (82.6% are now AI-generated per IBM), and compress dwell time inside compromised networks from 9 days to 5.

49%

More active ransomware groups in 2025 vs prior year

IBM X-Force 2026

78%

Of companies hit by ransomware in the past year

IBM X-Force 2026

5 days

Median attacker dwell time inside a network — down from 9

IBM / Trend Micro 2026

$74B

Forecast ransomware damage costs for 2026

GuidePoint Security 2026

Anthropic’s own advice to defenders is urgent but practical: adopt the current frontier models (Opus 4.6 and similar) for vulnerability-finding immediately, shorten patch cycles, automate incident response, and build familiarity with language models as security tools before Mythos-class capabilities become the industry standard. The window where defenders hold an asymmetric advantage is open. It won’t stay open.

§ 07 / The Competition

GPT-5.5 ships publicly. On the U.K. government’s cyber eval, it scores higher than Mythos.

OpenAI released GPT-5.5 on April 23, 2026 — sixteen days after Mythos was announced and two days after the Discord breach story broke. It is publicly available to anyone with an API key. On the U.K. AI Security Institute’s standardized offensive cybersecurity battery, GPT-5.5 scores 71.4% — 2.8 points above Mythos’s 68.6%. GPT-5.5 also completed AISI’s 32-step corporate-network attack simulation in 2 of 10 attempts. Mythos managed 3 of 10 — the only two models ever to finish it.

This sharpens the core question Mythos raises: if GPT-5.5 matches or slightly exceeds Mythos on a government-administered cybersecurity test and is publicly available, what does restricting Mythos actually accomplish? The answer lies in test design. The AISI runs a standardized battery built for cross-model comparison. Anthropic’s internal tests involve real open-source codebases and actual zero-day discovery — and on those, Mythos shows a qualitatively different capability: 181 Firefox exploits, tier-5 control-flow hijack on ten targets, CVEs in codebases nobody had audited in 16–27 years. That capability profile did not appear in the AISI battery. GPT-5.5 received a “High” (not “Critical”) classification under OpenAI’s Preparedness Framework — defined as not yet able to develop zero-day exploits autonomously without human help. That is precisely what separates it from Mythos. For now.

Frontier Model Comparison

Mythos vs. the Frontier — Three Public Benchmarks

SWE-Bench Pro

Hardest coding benchmark · Multi-file real-world GitHub issues

Mythos PreviewAnthropic

77.8%

Claude Opus 4.7Anthropic

64.3%

GPT-5.5OpenAI

58.6%

GPT-5.4OpenAI

57.7%

Claude Opus 4.6Anthropic

53.4%

Source: Scale AI leaderboard · May 2026

GPQA Diamond

198 PhD-level science questions · Biology, chemistry, physics

Mythos PreviewAnthropic

94.6%

Claude Opus 4.7Anthropic

94.2%

Gemini 3.1 ProGoogle

94.1%

GPT-5.4OpenAI

92%

Claude Opus 4.6Anthropic

91.3%

Source: Multiple independent evaluations · April–May 2026

AISI Offensive Cyber Expert Tasks

U.K. AI Security Institute standardized offensive security battery

GPT-5.5OpenAI

71.4%

Mythos PreviewAnthropic

68.6%

GPT-5.4OpenAI

52.4%

Claude Opus 4.7Anthropic

48.6%

Source: AISI evaluation published May 2026 · aisi.gov.uk

Anthropic (latest)

Anthropic (prior)

Anthropic (Opus 4.6)

OpenAI (latest)

OpenAI (prior)

Google

“GPT-5.5 is here, and it's no potato. It narrowly beats Anthropic's Mythos Preview on Terminal-Bench 2.0 — and it's available to the public.”
VentureBeat · April 24, 2026

— Video Coverage

The AI Model TOO POWERFUL to be Released to the Public

Anthropic Mythos explained in 5 minutes

Why Anthropic's Mythos Is Sparking Alarm

Anthropic's Mythos AI Model Accessed by Unauthorized Users

GPT-5.5 Drops + Anthropic's Mythos Gets Breached — AI News in 5

— Social Reactions

Anthropic (official) — Dario Amodei statement on discussions with the Department of War · April 2026

FT Weekend — Dario Amodei: 'I don't want AI turned on our own people'