Anthropic Keeps Its Latest AI Mythos Behind Closed Doors
Rather than rushing a broad public launch, the company is testing its most advanced model within a security-focused consortium.
Anthropic on Tuesday announced a tightly controlled preview of “Mythos,” a new artificial intelligence model that the company says is among its most capable to date—and one it is deliberately keeping out of public hands. Instead, the San Francisco–based startup is routing access through a limited consortium of partners under a security initiative called Project Glasswing, reflecting both the promise and the peril of increasingly powerful AI systems in cybersecurity.
The model, formally referred to as “Claude Mythos Preview,” will be deployed by a small group of organizations—including Amazon, Apple, Microsoft, Cisco, CrowdStrike, and The Linux Foundation—to conduct what Anthropic describes as “defensive security work.”
Additional collaborators include Broadcom and Palo Alto Networks, alongside a broader pool of roughly 40 infrastructure-focused organizations granted preview access.
The strategy marks a departure from the conventional model release cycle, where performance gains are typically followed by rapid commercial deployment. Instead, Anthropic is effectively treating Mythos as dual-use infrastructure—capable of strengthening defenses, but also of amplifying offensive cyber capabilities if misapplied.
According to the company, Mythos has already identified “thousands” of previously unknown vulnerabilities across widely used software systems, including operating systems and web browsers. Many of these flaws, Anthropic says, qualify as zero-day vulnerabilities, with some dating back 27 years. The findings highlight a persistent blind spot in software maintenance: legacy codebases that remain embedded in critical infrastructure long after their original threat models have expired.
Mythos was not explicitly trained as a cybersecurity model. Rather, it is a general-purpose system within Anthropic’s Claude family, designed with advanced reasoning and agentic coding capabilities. These attributes appear to translate effectively into vulnerability discovery, allowing the model to autonomously scan both proprietary and open-source codebases for exploitable weaknesses. In internal testing, the company claims, Mythos significantly outperformed its existing “Opus” class models across domains such as software engineering and academic reasoning.
Yet the same capabilities that enable defensive scanning could, in principle, be weaponized. A previously leaked internal document described the model—then codenamed “Capybara”—as “by far the most powerful AI model we’ve ever developed,” while warning of its potential misuse in identifying and exploiting software bugs. The leak, which originated from an unsecured data repository and was first reported by Fortune, highlighted the growing challenge of safeguarding not just AI systems, but also the operational processes around them.
Anthropic’s cautious rollout is therefore as much about governance as it is about technology. Under Project Glasswing, partner organizations are expected to share insights from their use of Mythos, creating a feedback loop intended to benefit the broader cybersecurity ecosystem. The company has also committed up to $100 million in usage credits and $4 million in funding for open-source security initiatives, signaling an effort to align commercial incentives with collective resilience.
The initiative comes amid escalating concern over AI-enabled cyber threats. At this year’s RSA Conference in San Francisco, discussions were dominated by the prospect of automated attacks capable of outpacing traditional defenses. Industry data reflects this shift: a joint study by IBM and Palo Alto Networks found that 67% of surveyed executives reported being targeted by AI-driven attacks in the past year.
Anthropic itself is not insulated from these risks. The company disclosed that vulnerabilities in its Claude systems were previously exploited in attacks on approximately 30 organizations. More recently, it faced a series of operational missteps, including the accidental exposure of nearly 2,000 source code files and the inadvertent takedown of thousands of GitHub repositories during remediation efforts.
Complicating matters further is Anthropic’s ongoing legal dispute with the federal government, which arose after the Pentagon reportedly designated the company a supply-chain risk, citing Anthropic’s refusal to support certain autonomous surveillance and targeting applications.


