Saturday, June 21, 2025

ChatGPT, DeepSeek, Qwen 2.5-VL Vulnerable to AI Jailbreaks

This week, multiple research teams showcased jailbreaks for popular AI models, including OpenAI’s ChatGPT, DeepSeek, and Alibaba’s Qwen.

After its launch, the open-source R1 model by Chinese company DeepSeek caught the attention of the cybersecurity industry. Experts found that jailbreak methods, previously patched in other AI models, still function against DeepSeek.

AI jailbreaking allows attackers to bypass safeguards designed to stop LLMs from producing harmful content. Security researchers have demonstrated that methods like prompt injection and model manipulation can overcome these protections.

Threat intelligence firm Kela found that DeepSeek is affected by Evil Jailbreak, in which a chatbot is made to act as an evil confidant, and Leo, in which the chatbot takes on an unrestricted persona. Both jailbreaks have already been patched in ChatGPT.

Palo Alto Networks’ Unit 42 reported that DeepSeek is vulnerable to known AI jailbreak techniques.

The security firm successfully conducted the attack known as Deceptive Delight, which tricks generative AI models by embedding unsafe or restricted topics in benign narratives. This method was tested in the fall of 2024 against eight LLMs with an average success rate of 65%.

Palo Alto also successfully executed the Bad Likert Judge jailbreak, which asks the LLM to rate the harmfulness of responses on a Likert scale and then generate example responses matching each point on the scale.

Researchers discovered that DeepSeek is vulnerable to Crescendo, a jailbreak method that begins with innocuous dialogue and gradually shifts towards forbidden topics.

Alibaba has announced a new version of its Qwen AI model, claiming it outperforms the DeepSeek model.

Kela announced on Thursday that Alibaba’s new Qwen 2.5-VL model has vulnerabilities similar to those recently discovered in DeepSeek.

Kela's researchers found that jailbreaks designed for DeepSeek also work on Qwen. They successfully tested an existing jailbreak called Grandma, which tricks the model into sharing dangerous information by having it role-play as a grandmother.

Kela also found that Qwen 2.5-VL generated content on developing ransomware and other malware.

“The ability of AI models to produce infostealer malware instructions raises serious concerns, as cybercriminals could leverage these capabilities to automate and enhance their attack methodologies,” Kela said.

Many jailbreak methods for ChatGPT have been fixed over the years, but researchers still discover new ways to bypass its protections.

CERT/CC reported that researcher Dave Kuszmar found a ChatGPT-4o jailbreak vulnerability called Time Bandit. The attack involves asking the AI about a specific historical event or instructing it to pretend it is assisting with such an event.

“The jailbreak can be established in two ways, either through the Search function, or by prompting the AI directly,” CERT/CC explained in an advisory. “Once this historical timeframe has been established in the ChatGPT conversation, the attacker can exploit timeline confusion and procedural ambiguity in following prompts to circumvent the safety guidelines, resulting in ChatGPT generating illicit content. This information could be leveraged at scale by a motivated threat actor for malicious purposes.”
