OpenAI Unveils EVMbench to Detect, Patch, and Exploit Vulns in Blockchain Environments

By infosecbulletin / Thursday, February 19, 2026 / International

OpenAI, together with crypto firm Paradigm, has launched EVMbench, a benchmark that assesses how well AI agents can identify, fix, and exploit serious vulnerabilities in smart contracts.

EVMbench includes 120 curated vulnerabilities from 40 security audits, primarily from open code audit competitions on platforms like Code4rena.

Three Evaluation Modes:

EVMbench tests AI agents in three different capability modes, each focused on a unique stage of the smart contract security lifecycle.

Detect: Agents audit a smart contract repository and are scored on recall of ground-truth vulnerabilities and associated audit rewards.
Patch: Agents modify vulnerable contracts and must preserve intended functionality while eliminating exploitability, verified through automated tests and exploit checks.
Exploit: Agents execute end-to-end fund-draining attacks against deployed contracts on a sandboxed blockchain environment, with grading performed programmatically via transaction replay and on-chain verification.
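
As a rough illustration of the Detect scoring described above, the sketch below computes plain recall and an award-weighted recall for one repository. The `Finding` schema, the finding IDs, the dollar amounts, and the matching of reports to ground truth by ID are all assumptions made for this example; the article does not describe how EVMbench actually matches an agent's report to a known vulnerability.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """A ground-truth vulnerability (hypothetical schema, not EVMbench's own)."""
    vuln_id: str
    award_usd: float  # audit reward attached to this finding (made-up values)


def detect_scores(ground_truth, reported_ids):
    """Return (recall, award_recall) for one audited repository.

    recall       = share of ground-truth findings the agent reported
    award_recall = share of the total audit reward those findings carry
    """
    if not ground_truth:
        raise ValueError("ground_truth must not be empty")
    reported = set(reported_ids)
    hits = [f for f in ground_truth if f.vuln_id in reported]
    recall = len(hits) / len(ground_truth)
    total_award = sum(f.award_usd for f in ground_truth)
    hit_award = sum(f.award_usd for f in hits)
    award_recall = hit_award / total_award if total_award else 0.0
    return recall, award_recall


# Illustrative data only: three known findings, two of them reported,
# plus one report ("X-99") that matches nothing and earns no credit.
truth = [Finding("H-01", 6000.0), Finding("H-02", 3000.0), Finding("H-03", 1000.0)]
recall, award_recall = detect_scores(truth, ["H-01", "H-03", "X-99"])
print(f"recall={recall:.3f} award_recall={award_recall:.3f}")
```

Weighting by reward means that missing a high-value finding costs more than missing a minor one, which is one plausible reading of "associated audit rewards".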

OpenAI built a Rust-based harness to keep evaluation consistent: it deploys contracts in a controlled, repeatable way and blocks unsafe RPC methods. All exploitation tasks run in an isolated local Anvil environment, not on live networks.
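
To make the idea of blocking unsafe RPC methods concrete, here is a minimal sketch of a JSON-RPC filter that could sit in front of a local node. It is written in Python for brevity and is not the Rust harness itself. The blocked prefixes are an assumption chosen for illustration (Anvil exposes node-control calls such as `anvil_setBalance` and `evm_snapshot`); the article does not list which methods the real harness rejects.

```python
import json

# Assumption: an illustrative blocklist, not the harness's actual policy.
# These namespaces let a caller rewrite chain state or control the node.
BLOCKED_PREFIXES = ("anvil_", "hardhat_", "evm_", "debug_")


def is_allowed(method: str) -> bool:
    """True if the method name falls outside every blocked namespace."""
    return not method.startswith(BLOCKED_PREFIXES)


def filter_request(raw: str) -> dict:
    """Decide whether a single (non-batch) JSON-RPC request is forwarded.

    Allowed requests are returned for forwarding; blocked ones get a
    JSON-RPC 2.0 error object with code -32601 ("Method not found").
    """
    req = json.loads(raw)
    method = req.get("method", "")
    if is_allowed(method):
        return {"forward": True, "request": req}
    return {
        "forward": False,
        "response": {
            "jsonrpc": "2.0",
            "id": req.get("id"),
            "error": {"code": -32601, "message": f"method {method} is blocked"},
        },
    }


ok = filter_request('{"jsonrpc":"2.0","id":1,"method":"eth_sendRawTransaction","params":[]}')
bad = filter_request('{"jsonrpc":"2.0","id":2,"method":"anvil_setBalance","params":[]}')
print(ok["forward"], bad["forward"])
```

Without a filter of this kind, an agent could credit itself funds directly through a node-control call instead of exploiting the contract, which would make an exploit score meaningless.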

EVMbench shows that frontier models perform unevenly across the three tasks. In exploit mode, GPT-5.3-Codex scored 72.2%, up sharply from the 31.9% scored by GPT-5 about six months earlier.

Limitation:

OpenAI stated that EVMbench does not fully capture the difficulty of real-world smart contract security. Its vulnerabilities come largely from Code4rena audit competitions; although they are realistic and severe, many heavily used crypto contracts receive even more scrutiny and may be harder to exploit.

OpenAI has allocated $10 million in API credits for its Cybersecurity Grant Program to boost research in defensive security, focusing on open-source software and critical infrastructure.
