A Statement on Project Glasswing

On April 7, a frontier AI model found a vulnerability in OpenBSD that had been sitting there for 27 years. In minutes. Not a research team working for months. Not a fuzzer running for years. A model, pointed at the code, that found what everyone else missed. Then it found a 16-year-old flaw in FFmpeg, in a line of code that automated testing tools had executed 5 million times without catching it. Then it chained Linux kernel exploits that give an attacker complete control of a system. Then it found thousands more zero-days across every major operating system and every major browser.

This was Project Glasswing. Anthropic gave twelve of the largest technology companies on earth, including AWS, Apple, CrowdStrike, Google, Microsoft, and Palo Alto Networks, access to Claude Mythos Preview. Its job was to find vulnerabilities in the software the world runs on. It did. At a scale and speed that should make every security leader stop and think about what comes next.

Key takeaways:

An AI model found a 27-year-old vulnerability in minutes that humans and automated tools missed for decades. It found thousands more across every major OS and browser. Fewer than 1% have been patched.
This capability will not stay restricted. Within 12 to 18 months, AI-driven vulnerability discovery will be broadly accessible, including to threat actors.
The security model built around preventing known threats is fundamentally inadequate for a world where AI generates unknown threats on demand. The weight shifts to detection and response.
Cybersecurity is entering a new era. The organizations that move to autonomous defense now will survive it. The ones that wait will learn the hard way.

What Just Changed

Let me be specific about why this matters. This is not another AI benchmark story. This is a fundamental shift in what is possible.

Mythos scored 83.1% on the CyberGym vulnerability reproduction benchmark. Claude Opus 4.6 scored 66.6%. Anthropic says the model surpasses "all but the most elite human researchers" at finding and exploiting software vulnerabilities. Based on what it found, I do not think that is an exaggeration.

The thing is, these are not obscure projects with no security investment. OpenBSD, the Linux kernel, FFmpeg, major browsers. Dedicated security teams, decades of code review, millions of dollars in fuzzing infrastructure. One model, running for a few weeks, found what all of that missed.

If the most scrutinized software on the planet has this many undiscovered vulnerabilities, the total vulnerability surface across all software, across every enterprise, every SaaS product, every open-source dependency in your stack, is orders of magnitude larger than anyone assumed.

Fewer than 1% of the vulnerabilities Mythos found have been patched. Discovery just went to AI speed. Remediation is still at human speed. That gap is where every organization on earth is now exposed.

The Future This Creates

Mythos Preview is restricted today. Twelve partners, a verification program, responsible governance. Anthropic structured this carefully, and I give them credit for that.

But here is what I think the industry needs to confront honestly: this capability will not stay restricted.

GPT-5.4 already has strong vulnerability-finding capabilities. Simon Willison pointed this out publicly, and he is right. The techniques that make this work, multi-step reasoning, deep code understanding, environment interaction, are being replicated across every major AI lab and increasingly in open-weight models. Not at the same performance level yet. But the trajectory is clear, and it is steep.

Offensive capabilities have always propagated faster than defensive ones. That has been true since the first exploit was shared on a forum. AI does not change that pattern. It accelerates it.

So here is the future we need to prepare for. Within 12 to 18 months, the ability to point an AI model at a piece of software and find exploitable vulnerabilities will not be limited to twelve companies with restricted access. It will be available to anyone with sufficient compute. Including threat actors.

Consider what happened with MOVEit. The CL0P ransomware group exploited a single vulnerability in a file transfer tool and compromised over 2,500 organizations across more than 30 countries. No AI involved. Just one vulnerability, automated at scale. Now imagine that same playbook, except the threat actor does not need to wait for a CVE. They point a model at the software, find a zero-day in minutes, and begin exploitation immediately.

That is not science fiction. That is the world Glasswing just previewed.

How the Industry Has to Change

The speed gap between offense and defense was already bad before Glasswing. CrowdStrike's 2024 Global Threat Report measured the average eCrime breakout time at 62 minutes, down from 84 minutes the year before. The fastest recorded breakout was 2 minutes and 7 seconds. Palo Alto Networks Unit 42 reported that in 45% of their incident response cases, attackers exfiltrated data within a single day of compromise.

Now layer AI-driven vulnerability discovery on top of that. The starting point for attackers gets dramatically faster. The volume of exploitable entry points multiplies. And the defensive playbook that most organizations are running, built around preventing known threats from getting in, stops working.

This is not an incremental change. The security model that the industry has operated under for decades is built on an assumption: that threats are known, that someone has discovered the vulnerability, written a CVE, published a patch, and your team has applied it before the threat actor arrives. When AI can generate unknown vulnerabilities on demand, that assumption is gone. It comes down to how quickly you can detect and how effectively you can respond. Everything else is secondary.

CISOs need to rethink how they allocate budget, how they structure their teams, and what they expect their tools to do. Security vendors need to stop selling incremental improvements to a model that is about to be obsolete. And the industry as a whole needs to have an honest conversation about what defense looks like when the attacker has capabilities that did not exist two years ago.

The Five Pillars and Why One of Them Really Matters Now

I have been saying this for two years, in every room with a CISO, at every conference, in every conversation where someone would listen. Glasswing just proved it with data nobody can argue with.

Cybersecurity fundamentally breaks down into five pillars. I use this framework because it forces clarity about where you are actually spending your time and money.

Visibility: knowing what you have. Protection: hardening it. Detection and Response: catching and stopping threats. Business Continuity and Disaster Recovery: surviving the worst case. Governance, Risk, and Compliance: meeting your obligations.

For decades, the industry poured the majority of its budget into pillar two. Protection. Patching, signatures, firewall rules, access controls, hardening guides. That made sense when the threats were known.

When a threat actor points a frontier model at your infrastructure and finds a zero-day in three minutes, there is no CVE. There is no patch. There is no signature. There is nothing to protect against, because the vulnerability was not in any database until the model found it.

“Protection doesn't work when the vulnerability doesn't exist yet. It comes down to how quickly can we detect and how effectively can we respond.”

David Colombo, Founder & CEO, Alaris

The entire weight of defense shifts to pillar three. Not as a nice-to-have. As the only thing that stands between a threat actor and their objective when the vulnerability they are exploiting has never been seen before.

What Comes Next

Compare a traditional SOC workflow to what we are now facing. An analyst sees an alert in the SIEM. Pivots to the XDR for endpoint context. Opens a case in the ticketing system. Checks threat intelligence in a separate feed. Triggers a containment action through the SOAR platform. That process takes 30 to 90 minutes for a single alert. Meanwhile, the attacker found the vulnerability in minutes and has been inside the environment the entire time.

When exploitation operates at AI speed, human-speed detection and response is not slow. It is irrelevant.
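To make the timing argument concrete, here is a minimal Python sketch that compares the figures quoted in this post: the 62-minute average and 2 minute 7 second fastest breakout times, against the 30 to 90 minutes a manual alert workflow takes. The function names are illustrative only, and the sketch makes one generous assumption that is not in the data: the alert fires at the exact moment of compromise. Real detection delay would only widen the gap.

```python
# Figures quoted in this post, all in minutes.
AVG_BREAKOUT_MIN = 62.0            # average eCrime breakout time
PREV_AVG_BREAKOUT_MIN = 84.0       # the year before
FASTEST_BREAKOUT_MIN = 2 + 7 / 60  # 2 minutes 7 seconds
MANUAL_ALERT_RANGE_MIN = (30.0, 90.0)  # manual handling time per alert


def year_over_year_drop(previous: float, current: float) -> float:
    """Percentage reduction from the previous value to the current one."""
    return (previous - current) / previous * 100


def head_start(response_min: float, breakout_min: float) -> float:
    """Minutes the attacker has after breakout before the response completes.

    Assumes (optimistically) that the alert fires at the moment of compromise.
    A negative result means the response finishes before breakout.
    """
    return response_min - breakout_min


if __name__ == "__main__":
    drop = year_over_year_drop(PREV_AVG_BREAKOUT_MIN, AVG_BREAKOUT_MIN)
    print(f"Average breakout time fell by {drop:.1f}% year over year")

    for label, breakout in (("average", AVG_BREAKOUT_MIN),
                            ("fastest", FASTEST_BREAKOUT_MIN)):
        for response in MANUAL_ALERT_RANGE_MIN:
            gap = head_start(response, breakout)
            outcome = "attacker ahead" if gap > 0 else "defender ahead"
            print(f"{label} breakout, {response:.0f} min response: "
                  f"{gap:+.1f} min ({outcome})")
```

Against the average breakout time, only the fast end of the manual range finishes first. Against the fastest recorded breakout, even a 30-minute response arrives well after the attacker has moved.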

That is why we built Autonomous Security Operations. Not because it is a nice category to own. Because the threat landscape now demands detection and response that operates at machine speed, autonomously, across the full lifecycle. Detection engineering, triage, investigations and incident management, threat hunting, containment and response, reporting. All six stages, on one architecture, at machine speed. Not waiting for a human to carry context between five different browser tabs.

The argument for autonomous security operations was already strong. A 4.8-million-person talent shortage. Alert volumes that exceed human capacity. Threat actors with sub-hour breakout times. Glasswing made the timeline urgent. When frontier AI can find thousands of zero-days in weeks, the window where human-speed defense is viable is closing fast.

Cybersecurity is entering a new era. The organizations that move to autonomous defense now will be ready for it. The ones that wait will learn the hard way.

David Colombo

Founder & CEO, Alaris

David Colombo is the CEO and Co-Founder of Alaris, the company pioneering Autonomous Security Operations. Before founding Alaris, David gained international recognition for his cybersecurity research, including the discovery of vulnerabilities affecting Tesla vehicles worldwide. He is based in San Francisco.
