Thursday, July 3, 2025

ChatGPT, DeepSeek, Qwen 2.5-VL Vulnerable to AI Jailbreaks

This week, multiple research teams showcased jailbreaks for popular AI models, including OpenAI’s ChatGPT, DeepSeek, and Alibaba’s Qwen.

After its launch, the open-source R1 model by Chinese company DeepSeek caught the attention of the cybersecurity industry. Experts found that jailbreak methods, previously patched in other AI models, still function against DeepSeek.

AI jailbreaking allows attackers to bypass safeguards designed to stop LLMs from producing harmful content. Security researchers have demonstrated that methods like prompt injection and model manipulation can overcome these protections.

Threat intelligence firm Kela found that DeepSeek is affected by two known jailbreaks: Evil Jailbreak, in which the chatbot is made to act as an evil confidant, and Leo, in which the chatbot takes on an unrestricted persona. Both have since been fixed in ChatGPT.

Palo Alto Networks’ Unit 42 reported that DeepSeek is vulnerable to known AI jailbreak techniques.

The security firm successfully carried out Deceptive Delight, an attack that tricks generative AI models by embedding unsafe or restricted topics in benign narratives. When the method was tested against eight LLMs in the fall of 2024, it achieved an average success rate of 65%.
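To make the "average success rate" figure concrete, here is a minimal sketch of how such a number is typically computed: each model gets its own success rate (successful attempts divided by total attempts), and the per-model rates are then averaged. The function names and the counts below are hypothetical placeholders for illustration only; they are not the data behind the 65% figure, and the original study may have aggregated its results differently.

```python
from statistics import mean


def attack_success_rate(successes: int, attempts: int) -> float:
    """Fraction of jailbreak attempts that succeeded against one model."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    if not 0 <= successes <= attempts:
        raise ValueError("successes must be between 0 and attempts")
    return successes / attempts


def average_asr(per_model: dict[str, tuple[int, int]]) -> float:
    """Unweighted mean of the per-model success rates."""
    return mean(attack_success_rate(s, a) for s, a in per_model.values())


# Hypothetical placeholder counts: (successes, attempts) per model.
# These are NOT results from any published evaluation.
placeholder_results = {
    "model_a": (3, 4),
    "model_b": (1, 4),
}

print(f"average success rate: {average_asr(placeholder_results):.0%}")
```

Averaging per-model rates gives every model equal weight regardless of how many prompts it was tested with, which is one reasonable reading of "average success rate" across a set of models.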

Palo Alto also successfully executed the Bad Likert Judge jailbreak, which asks the LLM to rate the harmfulness of responses on a Likert scale and then generate example responses that match each rating.

Researchers discovered that DeepSeek is vulnerable to Crescendo, a jailbreak method that begins with innocuous dialogue and gradually shifts towards forbidden topics.
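Gradual escalation of this kind is hard to catch with checks that look at one message at a time, because no single turn looks alarming. The sketch below illustrates one possible defensive idea, not any vendor's actual safeguard: keep a decayed running total of per-turn risk so that a slow drift is flagged even when every individual turn stays under the per-turn limit. The class name, the thresholds, and the risk scores are all hypothetical; in practice the scores would come from some content classifier that is assumed here rather than implemented.

```python
from dataclasses import dataclass


@dataclass
class ConversationMonitor:
    """Toy conversation-level risk tracker (all thresholds are illustrative)."""

    per_turn_limit: float = 0.8    # flag any single turn at or above this score
    cumulative_limit: float = 1.5  # flag when the decayed running total reaches this
    decay: float = 0.9             # how much of the earlier risk carries forward
    running: float = 0.0           # decayed running total of risk so far

    def observe(self, turn_risk: float) -> bool:
        """Record one turn's risk score in [0, 1]; return True if flagged."""
        if not 0.0 <= turn_risk <= 1.0:
            raise ValueError("turn_risk must be in [0, 1]")
        self.running = self.running * self.decay + turn_risk
        return (
            turn_risk >= self.per_turn_limit
            or self.running >= self.cumulative_limit
        )


# Hypothetical scores for a conversation that escalates slowly.
# Every turn is below the per-turn limit of 0.8.
scores = [0.1, 0.3, 0.5, 0.6, 0.7]
monitor = ConversationMonitor()
flags = [monitor.observe(s) for s in scores]
print(flags)
```

With these placeholder values, a per-turn check alone would never fire, while the running total crosses the cumulative limit on the final turn.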

Alibaba has announced a new version of its Qwen AI model, claiming it outperforms the DeepSeek model.

Kela announced on Thursday that Alibaba’s new Qwen 2.5-VL model has vulnerabilities similar to those recently discovered in DeepSeek.

Kela’s researchers found that jailbreaks designed for DeepSeek also work on Qwen. They also successfully tested an existing jailbreak called Grandma, which tricks the model into sharing dangerous information by having it role-play as a grandmother.

Kela found that Qwen 2.5-VL created content about developing ransomware and other malware.

“The ability of AI models to produce infostealer malware instructions raises serious concerns, as cybercriminals could leverage these capabilities to automate and enhance their attack methodologies,” Kela said.

Many jailbreak methods for ChatGPT have been fixed over the years, but researchers still discover new ways to bypass its protections.

CERT/CC reported that researcher Dave Kuszmar found a ChatGPT-4o jailbreak vulnerability called Time Bandit. The attack involves asking the AI about a specific historical event, or instructing it to pretend it is assisting the user in such an event.

“The jailbreak can be established in two ways, either through the Search function, or by prompting the AI directly,” CERT/CC explained in an advisory. “Once this historical timeframe has been established in the ChatGPT conversation, the attacker can exploit timeline confusion and procedural ambiguity in following prompts to circumvent the safety guidelines, resulting in ChatGPT generating illicit content. This information could be leveraged at scale by a motivated threat actor for malicious purposes.”
