Sunday , February 2 2025
Qwen

ChatGPT, DeepSeek, Qwen 2.5-VL Vulnerable to AI Jailbreaks

This week, multiple research teams showcased jailbreaks for popular AI models, including OpenAI’s ChatGPT, DeepSeek, and Alibaba’s Qwen.

After its launch, the open-source R1 model by Chinese company DeepSeek caught the attention of the cybersecurity industry. Experts found that jailbreak methods, previously patched in other AI models, still function against DeepSeek.

US scientists claim to replicate DeepSeek for $30 dubbed “TinyZero,”

Researchers at the University of California, Berkeley, claims they’ve managed to reproduce the core technology behind DeepSeek’s at a total...
Read More
US scientists claim to replicate DeepSeek for $30 dubbed “TinyZero,”

ChatGPT, DeepSeek, Qwen 2.5-VL Vulnerable to AI Jailbreaks

This week, multiple research teams showcased jailbreaks for popular AI models, including OpenAI's ChatGPT, DeepSeek, and Alibaba's Qwen. After its...
Read More
ChatGPT, DeepSeek, Qwen 2.5-VL Vulnerable to AI Jailbreaks

Paragon Attack WhatsApp With New Zero-Click Spyware

WhatsApp reveiled on Friday that a "zero-click" spyware attack, linked to the Israeli company Paragon, has targeted many users globally,...
Read More
Paragon Attack WhatsApp With New Zero-Click Spyware

Everything I Say Leaks,’ Zuckerberg Says in Leaked Meeting Audio

At an all-hands meeting at Meta on Thursday, Mark Zuckerberg did not mention the company's $25 million settlement with Donald...
Read More
Everything I Say Leaks,’ Zuckerberg Says in Leaked Meeting Audio

Indian tech giant Tata Tech hit by ransomware attack

Tata Technologies reported a ransomware incident affecting some IT services, but it did not disrupt client deliveries, according to a...
Read More
Indian tech giant Tata Tech hit by ransomware attack

Vulnarabilitties found in Cisco webex and VMware Aria operation

A serious cybersecurity flaw in Cisco Webex Chat has been discovered, allowing unauthorized attackers to access the chat histories of...
Read More
Vulnarabilitties found in Cisco webex and VMware Aria operation

Microsoft to boost M365 bounty program rewards Up to $27,000

Microsoft has announced a major expansion of its Microsoft 365 Bounty Program. The program now covers new Viva products for...
Read More
Microsoft to boost M365 bounty program rewards Up to $27,000

DeepSeek reveils over 1 million chat records; Italy Bans DeepSeek

Chinese AI startup DeepSeek has exposed two databases with sensitive user and operational information from its DeepSeek-R1 LLM model. Unsecured...
Read More
DeepSeek reveils over 1 million chat records; Italy Bans DeepSeek

Microsoft brings DeepSeeK to Azure AI Foundry and GitHub

Microsoft has added DeepSeek’s R1 AI model to its Azure AI Foundry platform and GitHub. This lets customers easily integrate...
Read More
Microsoft brings DeepSeeK to Azure AI Foundry and GitHub

Hackers leverage Google’s subdomains, phone number to attack victims

Scammers called a victim using Google's official support number and sent an email from an official subdomain. It's unclear how...
Read More
Hackers leverage Google’s subdomains, phone number to attack victims

AI jailbreaking allows attackers to bypass safeguards designed to stop LLMs from producing harmful content. Security researchers have demonstrated that methods like prompt injection and model manipulation can overcome these protections.

Threat intelligence firm Kela found that DeepSeek is affected by Evil Jailbreak, where a chatbot is made to act as an evil confidant, and Leo, which allows the chatbot to take on an unrestricted persona. ChatGPT has fixed these vulnerabilities.

Palo Alto Networks’ Unit42 reported that DeepSeek is vulnerable to known AI jailbreak techniques.

The security firm successfully conducted the attack known as Deceptive Delight, which tricks generative AI models by embedding unsafe or restricted topics in benign narratives. This method was tested in the fall of 2024 against eight LLMs with an average success rate of 65%.

Palo Alto has successfully executed the Bad Likert Judge jailbreak, which asks the LLM to evaluate the harmfulness of responses using a Likert scale and generate examples that fit the scale.

Researchers discovered that DeepSeek is vulnerable to Crescendo, a jailbreak method that begins with innocuous dialogue and gradually shifts towards forbidden topics.

Alibaba has announced a new version of its Qwen AI model, claiming it outperforms the DeepSeek model.

Kela announced on Thursday that Alibaba’s new Qwen 2.5-VL model has vulnerabilities similar to those recently discovered in DeepSeek.

Researchers at a threat intelligence firm found that jailbreaks designed for DeepSeek also work on Qwen. They successfully tested an existing jailbreak called Grandma, which tricks the model into sharing dangerous information by having it role-play as a grandmother.

Kela found that Qwen 2.5-VL created content about developing ransomware and other malware.

“The ability of AI models to produce infostealer malware instructions raises serious concerns, as cybercriminals could leverage these capabilities to automate and enhance their attack methodologies,” Kela said.

Many jailbreak methods for ChatGPT have been fixed over the years, but researchers still discover new ways to bypass its protections.

CERT/CC reported that researcher Dave Kuszmar found a ChatGPT-4o jailbreak vulnerability called Time Bandit. This vulnerability allows users to ask the AI about specific historical events or instruct it to pretend to assist in such events.

“The jailbreak can be established in two ways, either through the Search function, or by prompting the AI directly,” CERT/CC explained in an advisory. “Once this historical timeframe has been established in the ChatGPT conversation, the attacker can exploit timeline confusion and procedural ambiguity in following prompts to circumvent the safety guidelines, resulting in ChatGPT generating illicit content. This information could be leveraged at scale by a motivated threat actor for malicious purposes.”

Check Also

BitLocker Encryption

Memory-Dump-UEFI
Researcher dumping memory to bypass BitLocker on Windows 11

Researchers have demonstrated a method to bypass Windows 11’s BitLocker encryption, enabling the extraction of …

Leave a Reply

Your email address will not be published. Required fields are marked *