Sunday , June 1 2025
Qwen

ChatGPT, DeepSeek, Qwen 2.5-VL Vulnerable to AI Jailbreaks

This week, multiple research teams showcased jailbreaks for popular AI models, including OpenAI’s ChatGPT, DeepSeek, and Alibaba’s Qwen.

After its launch, the open-source R1 model by Chinese company DeepSeek caught the attention of the cybersecurity industry. Experts found that jailbreak methods, previously patched in other AI models, still function against DeepSeek.

CISA Issued Guidance for SIEM and SOAR Implementation

CISA and ACSC issued new guidance this week on how to procure, implement, and maintain SIEM and SOAR platforms. SIEM...
Read More
CISA Issued Guidance for SIEM and SOAR Implementation

Linux flaws enable password hash theft via core dumps in Ubuntu, RHEL, Fedora

The Qualys Threat Research Unit (TRU) found two local information-disclosure vulnerabilities in Apport and systemd-coredump. Both issues are race-condition vulnerabilities....
Read More
Linux flaws enable password hash theft via core dumps in Ubuntu, RHEL, Fedora

Australia enacts mandatory ransomware payment reporting

New ransomware payment reporting rules take effect in Australia yesterday (May 30) for all organisations with an annual turnover of...
Read More
Australia enacts mandatory ransomware payment reporting

Why Govt Demands Foreign CCTV Firms to Submit Source Code?

Global makers of surveillance gear have clashed with Indian regulators in recent weeks over contentious new security rules that require...
Read More
Why Govt Demands Foreign CCTV Firms to Submit Source Code?

CVE-2023-39780
Botnet hacks thousands of ASUS routers

GreyNoise has discovered a campaign where attackers have gained unauthorized access to thousands of internet-exposed ASUS routers. This seems to...
Read More
CVE-2023-39780  Botnet hacks thousands of ASUS routers

Bangladesh Bank instructed using AI to prevent online gambling

The rise of online gambling in the country is leading to increased crime and societal issues. In response, the central...
Read More
Bangladesh Bank instructed using AI to prevent online gambling

251 Amazon-Hosted IPs Used in Exploit Scan for ColdFusion, Struts, and Elasticsearch

Cybersecurity researchers recently revealed a coordinated cloud-based scanning attack that targeted 75 different exposure points earlier this month. On May...
Read More
251 Amazon-Hosted IPs Used in Exploit Scan for ColdFusion, Struts, and Elasticsearch

Zero-Trust Policy bypass to Exploit Vulns & Manipulate NHI Secrets

Recent security research has shown that attackers can weaken zero-trust security frameworks by exploiting a key DNS vulnerability, disrupting automated...
Read More
Zero-Trust Policy bypass to Exploit Vulns & Manipulate NHI Secrets

Evaly E-commerce Platform Allegedly Hacked

Evaly, a Bangladeshi e-commerce platform, is reportedly facing a major data breach that may have exposed sensitive information of around...
Read More
Evaly E-commerce Platform Allegedly Hacked

Exploitable Vulns in Canon Printers Allow Gaining Admin Privileges

A passback vulnerability has been found in some Canon printers, including production and multifunction models. If an attacker gains administrative...
Read More
Exploitable Vulns in Canon Printers Allow Gaining Admin Privileges

AI jailbreaking allows attackers to bypass safeguards designed to stop LLMs from producing harmful content. Security researchers have demonstrated that methods like prompt injection and model manipulation can overcome these protections.

Threat intelligence firm Kela found that DeepSeek is affected by Evil Jailbreak, where a chatbot is made to act as an evil confidant, and Leo, which allows the chatbot to take on an unrestricted persona. ChatGPT has fixed these vulnerabilities.

Palo Alto Networks’ Unit42 reported that DeepSeek is vulnerable to known AI jailbreak techniques.

The security firm successfully conducted the attack known as Deceptive Delight, which tricks generative AI models by embedding unsafe or restricted topics in benign narratives. This method was tested in the fall of 2024 against eight LLMs with an average success rate of 65%.

Palo Alto has successfully executed the Bad Likert Judge jailbreak, which asks the LLM to evaluate the harmfulness of responses using a Likert scale and generate examples that fit the scale.

Researchers discovered that DeepSeek is vulnerable to Crescendo, a jailbreak method that begins with innocuous dialogue and gradually shifts towards forbidden topics.

Alibaba has announced a new version of its Qwen AI model, claiming it outperforms the DeepSeek model.

Kela announced on Thursday that Alibaba’s new Qwen 2.5-VL model has vulnerabilities similar to those recently discovered in DeepSeek.

Researchers at a threat intelligence firm found that jailbreaks designed for DeepSeek also work on Qwen. They successfully tested an existing jailbreak called Grandma, which tricks the model into sharing dangerous information by having it role-play as a grandmother.

Kela found that Qwen 2.5-VL created content about developing ransomware and other malware.

“The ability of AI models to produce infostealer malware instructions raises serious concerns, as cybercriminals could leverage these capabilities to automate and enhance their attack methodologies,” Kela said.

Many jailbreak methods for ChatGPT have been fixed over the years, but researchers still discover new ways to bypass its protections.

CERT/CC reported that researcher Dave Kuszmar found a ChatGPT-4o jailbreak vulnerability called Time Bandit. This vulnerability allows users to ask the AI about specific historical events or instruct it to pretend to assist in such events.

“The jailbreak can be established in two ways, either through the Search function, or by prompting the AI directly,” CERT/CC explained in an advisory. “Once this historical timeframe has been established in the ChatGPT conversation, the attacker can exploit timeline confusion and procedural ambiguity in following prompts to circumvent the safety guidelines, resulting in ChatGPT generating illicit content. This information could be leveraged at scale by a motivated threat actor for malicious purposes.”

Check Also

Intel

Intel PC, laptop and server processors affected for 6 years: Report

A new class of vulnerabilities in Intel processors, called Branch Predictor Race Conditions (BPRC), enables …

Leave a Reply

Your email address will not be published. Required fields are marked *