This week, multiple research teams showcased jailbreaks for popular AI models, including OpenAI’s ChatGPT, DeepSeek, and Alibaba’s Qwen.
After its launch, the open-source R1 model by Chinese company DeepSeek caught the attention of the cybersecurity industry. Experts found that jailbreak methods, previously patched in other AI models, still function against DeepSeek.
AI jailbreaking allows attackers to bypass safeguards designed to stop LLMs from producing harmful content. Security researchers have demonstrated that methods like prompt injection and model manipulation can overcome these protections.
Threat intelligence firm Kela found that DeepSeek is affected by the Evil Jailbreak, which makes the chatbot act as an evil confidant, and Leo, which has it adopt an unrestricted persona. Both jailbreaks have since been patched in ChatGPT.
Palo Alto Networks’ Unit42 reported that DeepSeek is vulnerable to known AI jailbreak techniques.
The security firm successfully carried out the attack known as Deceptive Delight, which tricks generative AI models by embedding unsafe or restricted topics in benign narratives. The method was tested in the fall of 2024 against eight LLMs, achieving an average success rate of 65%.
Palo Alto also successfully executed the Bad Likert Judge jailbreak, which asks the LLM to rate the harmfulness of responses on a Likert scale and then generate examples that fit the scale.
Researchers discovered that DeepSeek is vulnerable to Crescendo, a jailbreak method that begins with innocuous dialogue and gradually shifts towards forbidden topics.
Alibaba has announced a new version of its Qwen AI model, claiming it outperforms the DeepSeek model.
Kela announced on Thursday that Alibaba’s new Qwen 2.5-VL model has vulnerabilities similar to those recently discovered in DeepSeek.
The firm's researchers found that jailbreaks designed for DeepSeek also work against Qwen. They successfully tested an existing jailbreak called Grandma, which tricks the model into sharing dangerous information by having it role-play as a grandmother.
Kela found that Qwen 2.5-VL generated content on developing ransomware and other malware.
“The ability of AI models to produce infostealer malware instructions raises serious concerns, as cybercriminals could leverage these capabilities to automate and enhance their attack methodologies,” Kela said.
Many jailbreak methods for ChatGPT have been fixed over the years, but researchers still discover new ways to bypass its protections.
CERT/CC reported that researcher Dave Kuszmar discovered a ChatGPT-4o jailbreak vulnerability named Time Bandit, which can be exploited by asking the AI questions about a specific historical event or time period, or by instructing it to pretend it is assisting the user within that historical context.
“The jailbreak can be established in two ways, either through the Search function, or by prompting the AI directly,” CERT/CC explained in an advisory. “Once this historical timeframe has been established in the ChatGPT conversation, the attacker can exploit timeline confusion and procedural ambiguity in following prompts to circumvent the safety guidelines, resulting in ChatGPT generating illicit content. This information could be leveraged at scale by a motivated threat actor for malicious purposes.”