Wednesday , June 24 2026

OpenAI Unveils EVMbench to Detect, Patch, and Exploit Vulns in Blockchain Environments

OpenAI along with crypto firm Paradigm have launched EVMbench, a benchmark to assess AI agents’ skills in identifying, fixing, and exploiting serious vulnerabilities in smart contracts.

EVMbench includes 120 curated vulnerabilities from 40 security audits, primarily from open code audit competitions on platforms like Code4rena.

LastPass says hackers stole customer data via Klue, supply chain breach

LastPass has reported a security issue with its vendor, Klue. This incident allowed an attacker unauthorized access to customer data....
Read More
LastPass says hackers stole customer data via Klue, supply chain breach

New Apple Exploit Bypasses Boot Defenses, Possibly Affects Millions of iPhones Worldwide

Researchers at cybersecurity firm Paradigm Shift found a new flaw called usbliter8. This flaw can get around main boot protections...
Read More
New Apple Exploit Bypasses Boot Defenses, Possibly Affects Millions of iPhones Worldwide

India’s Tata Electronics hit by cyber breach: Hacker target 630 GB record

A cyber attack seems to have affected one of India's top electronics companies. Tata Electronics has said there was a...
Read More
India’s Tata Electronics hit by cyber breach: Hacker target 630 GB record

Anthropic’s Mythos reportedly broke NSA classified systems in hours

The recent finding shows how powerful Mythos is: the AI can access the US government's secret networks in just a...
Read More
Anthropic’s Mythos reportedly broke NSA classified systems in hours

OpenAI New Method “Deployment Simulation” Predicts AI Risks Before Deployment

Test before going live is important for AI developers. But there's a problem: testing usually uses fake scenarios that often...
Read More
OpenAI New Method “Deployment Simulation” Predicts AI Risks Before Deployment

AryStinger botnet infected thousands of D-Link routers globally

AryStinger has taken control of over 4,000 old D-Link routers to use them as proxies for harmful traffic. The team...
Read More
AryStinger botnet infected thousands of D-Link routers globally

Hacker suspected of sending alerts across Brazil

Brazil's government suspects a hacking attack triggered an unauthorized ‌alert sent to cell phones across parts of the country early...
Read More
Hacker suspected of sending alerts across Brazil

CyberSentinel AI features 33 security tools like Nmap, SQLMap, and ZAP, utilizing Claude and GPT

A new open-source cybersecurity tool named CyberSentinel AI v3.0 has come out. It is an important step in self-operated security...
Read More
CyberSentinel AI features 33 security tools like Nmap, SQLMap, and ZAP, utilizing Claude and GPT

Barracuda hosts Dhaka roundtable on cyber resilience

Barracuda gathered industry people in Dhaka on 18 June 2026 for a roundtable talk about cyber resilience. The company shared...
Read More
Barracuda hosts Dhaka roundtable on cyber resilience

CISA Alerts Fortinet Users as FortiBleed Affects 86,644 FortiGate Devices

The U.S. Cybersecurity and Infrastructure Security Agency (CISA) asked Fortinet users with FortiGate devices on Thursday to act to protect...
Read More
CISA Alerts Fortinet Users as FortiBleed Affects 86,644 FortiGate Devices

Three Evaluation Modes:

EVMbench tests AI agents in three different capability modes, each focused on a unique stage of the smart contract security lifecycle.

Detect: Agents audit a smart contract repository and are scored on recall of ground-truth vulnerabilities and associated audit rewards.
Patch: Agents modify vulnerable contracts and must preserve intended functionality while eliminating exploitability, verified through automated tests and exploit checks.
Exploit: Agents execute end-to-end fund-draining attacks against deployed contracts on a sandboxed blockchain environment, with grading performed programmatically via transaction replay and on-chain verification.

OpenAI created a Rust-based tool for consistent evaluation, deploying contracts in a controlled manner and preventing unsafe RPC methods. All exploitation tasks occur in a secure local Anvil environment, not on live networks.

EVMbench shows that the Frontier model performs differently across tasks. In exploit mode, GPT-5.3-Codex scored 72.2%, a significant increase from GPT-5’s 31.9% six months ago.

Limitation:

Open AI stated that EVMbench does not fully capture the challenges of real-world smart contract security. The vulnerabilities it includes come from Code4rena audits. Although they are realistic and severe, many heavily used crypto contracts face even more scrutiny and might be tougher to exploit.

OpenAI has allocated $10 million in API credits for its Cybersecurity Grant Program to boost research in defensive security, focusing on open-source software and critical infrastructure.

Check Also

Rokarolla

New Rokarolla Android malware hits 217 banking and crypto apps

A new Android banking trojan called Rokarolla is hitting 217 banking and cryptocurrency apps with …