GPT-5.5 matches Claude Mythos in cyber attack tests: Report

By infosecbulletin / Friday, May 1 2026

Key Points:

The UK’s AI Security Institute (AISI) tested OpenAI’s GPT-5.5 and found that its cyberattack capabilities are on par with those of Anthropic’s Claude Mythos Preview.

GPT-5.5 is the second model, after Mythos, to fully complete a complex simulated attack on an enterprise network. The test network had no active defenses.

AISI sees this as part of a larger trend: cyberattack skills are emerging from general AI advances in areas such as autonomy and coding, not from cyber-specific training.

…………………………………………………………………………………………………………………………….

OpenAI’s GPT-5.5 matches Anthropic’s Claude Mythos Preview in cyberattack tests run by the UK AI Security Institute. The agency believes this points to a broader trend in AI attack capabilities.

The UK AI Security Institute tested OpenAI’s GPT-5.5 on a wide range of cyberattack challenges. The key finding: GPT-5.5 is the second model, after Claude Mythos Preview, to complete a complex simulated attack on an enterprise network. On some expert-level security tasks, GPT-5.5 did better than Anthropic’s model.

AISI concludes that the abilities it observed in Claude Mythos in April were not a one-off. They stem from broader improvements in autonomy, reasoning, and coding.

GPT-5.5 edges out Claude Mythos on isolated expert tasks

AISI tests AI models on 95 capture-the-flag tasks at four difficulty levels. The harder tasks were created with help from the cybersecurity companies Crystal Peak Security and Irregular. They include reverse engineering, developing exploits for various memory vulnerabilities, cryptographic attacks, and unpacking obfuscated malware.

At the hardest “Expert” level, GPT-5.5 has an average success rate of 71.4 percent, according to AISI. Claude Mythos Preview scores 68.6 percent. The difference is small, but GPT-5.5 might be the best model tested so far. For comparison, GPT-5.4 scored 52.4 percent and Claude Opus 4.7 scored 48.6 percent. Every top model has fully solved the basic tasks since at least February 2026.
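To put these Expert-level figures side by side, the short Python sketch below computes how far each model trails the top score, in percentage points. The rates are the ones reported above; the dictionary name `expert_rates` and the helper `gaps_to_leader` are illustrative names chosen for this example, not anything AISI published.

```python
# Expert-level average success rates as reported in the article (percent).
expert_rates = {
    "GPT-5.5": 71.4,
    "Claude Mythos Preview": 68.6,
    "GPT-5.4": 52.4,
    "Claude Opus 4.7": 48.6,
}


def gaps_to_leader(rates: dict[str, float]) -> dict[str, float]:
    """Return each model's shortfall to the top score, in percentage points."""
    best = max(rates.values())
    # Round to one decimal to match the precision of the reported figures.
    return {name: round(best - rate, 1) for name, rate in rates.items()}


for name, gap in gaps_to_leader(expert_rates).items():
    print(f"{name}: {expert_rates[name]:.1f}% ({gap:.1f} points behind the leader)")
```

The two newest models sit within about three points of each other, while the gap to the previous generation is roughly seven times larger.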

After Mythos, GPT-5.5 also cracks a full network attack simulation

Isolated tasks test single skills, but real attacks require many steps chained together. To measure this, AISI uses cyber ranges: simulated network environments with many hosts, services, and vulnerabilities.

The simulation “The Last Ones” (TLO) consists of 32 steps across four subnets and around 20 hosts. The AI agent starts without any credentials. It must find vulnerabilities, steal credentials, move laterally through the network, and finally reach a secured database. AISI estimates that a human expert would need about 20 hours.

GPT-5.5 solved TLO in 2 out of 10 attempts. Claude Mythos Preview managed it in 3 out of 10. AISI says performance improves with more computing power, and top models are still getting better. The more tokens a model spends on “thinking,” the better its chances of a successful attack.
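With only ten attempts per model, a 2-versus-3 split says little about which model is stronger. The Python sketch below makes that concrete by computing a Wilson score interval for each success rate. This is an illustration, not AISI's methodology: the choice of the Wilson interval, the 95 percent level (z = 1.96), and the assumption that the ten attempts are independent trials are all assumptions made here.

```python
import math


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> tuple[float, float]:
    """Return the Wilson score interval for a binomial proportion."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z**2 / trials
    centre = (p + z**2 / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z**2 / (4 * trials**2)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)


# TLO results as reported in the article: (successful runs, total attempts).
tlo_results = {"GPT-5.5": (2, 10), "Claude Mythos Preview": (3, 10)}

for model, (solved, attempts) in tlo_results.items():
    low, high = wilson_interval(solved, attempts)
    print(f"{model}: {solved}/{attempts} solved, 95% interval {low:.2f} to {high:.2f}")
```

Under these assumptions the two intervals overlap heavily, so the TLO results alone do not separate the two models.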

The tests had no defenders, no security monitoring, and no penalties for actions that would trigger alarms in real life. It is unclear whether GPT-5.5 or Mythos could succeed against well-defended systems. However, they are clearly effective against poorly defended networks.

A second test called “Cooling Tower” simulates an attack on an industrial control system. GPT-5.5 could not solve it. No model has completed this 7-step challenge yet. AISI says that GPT-5.5, like Mythos, failed at the upstream IT steps rather than at the control system itself.

A universal jailbreak bypassed every safeguard

AISI also tested GPT-5.5’s safeguards. The researchers found a jailbreak that worked on every harmful cyber request OpenAI had flagged, even complex ones. It took only six hours to develop.

OpenAI made many updates to its safety system, but AISI could not verify how well the final setup worked because of a problem with the version used. This shows once again that jailbreaks remain a major security problem for LLMs, even the best ones.

One main difference from Mythos is that GPT-5.5 is available in ChatGPT and via the API, while Anthropic restricts Claude Mythos to a small group of users. The AISI results suggest that Anthropic might have been too careful. Or maybe the critics are right, and the slow release has more to do with Anthropic’s computing limits than with safety concerns.
