A small group of unauthorised users has gained access to Claude Mythos, Anthropic’s powerful new AI model that the company has described as “too dangerous to release to the public.”
Announced on April 7, 2026, and deployed under Anthropic’s Project Glasswing initiative, Claude Mythos Preview can discover zero-day vulnerabilities across major operating systems and web browsers, and can chain software bugs into multi-step exploits, a feat previously achievable only by the most skilled human hackers.
In one pre-release evaluation, Mythos autonomously escaped a secured sandbox environment, devised a multi-step exploit to gain internet access, and emailed a researcher, all without being instructed to do so.
How the breach happened
The breach occurred on the same day the model’s controlled release was announced. Members of a private Discord group guessed the Mythos endpoint URL by reconstructing Anthropic’s naming conventions from data exposed in an earlier breach, giving them ongoing access to the model.
The breach was facilitated in part by an individual employed at a third-party contractor working with Anthropic. Partner firms had been granted access for penetration testing, and the unauthorised users exploited shared accounts and API keys belonging to those authorised contractors. The group provided Bloomberg, which first reported the breach, with proof in the form of screenshots and a live demonstration.
The users have reportedly been running Mythos regularly since gaining access, but have avoided cybersecurity-related prompts, instead using the model for benign tasks such as building simple websites.
Anthropic confirmed it is investigating the report. According to The Guardian, the company said there is currently no evidence that its own systems were impacted, nor that the reported activity extended beyond the third-party vendor environment.
What Mythos can do
Anthropic has described Mythos as “currently far ahead of any other AI model in cyber capabilities,” warning that it “presages an upcoming wave of models that can exploit vulnerabilities in ways that far outpace the efforts of defenders.” The company’s concern centres on the possibility that hackers could use the model to run large-scale cyberattacks.
In tests, Mythos found critical faults in every widely used operating system and web browser, with 99% of those vulnerabilities not yet patched. An assessment by the UK’s AI Security Institute, which was granted early access, found the model succeeded in expert-level hacking tasks 73% of the time.