Tuesday, March 4, 2025

ChatGPT, DeepSeek, Qwen 2.5-VL Vulnerable to AI Jailbreaks

This week, multiple research teams demonstrated jailbreaks against popular AI models, including OpenAI's ChatGPT, DeepSeek, and Alibaba's Qwen.

After its launch, the open-source R1 model from Chinese company DeepSeek caught the attention of the cybersecurity industry. Experts found that jailbreak methods previously patched in other AI models still work against DeepSeek.


AI jailbreaking allows attackers to bypass safeguards designed to stop LLMs from producing harmful content. Security researchers have demonstrated that methods like prompt injection and model manipulation can overcome these protections.

Threat intelligence firm Kela found that DeepSeek is susceptible to Evil Jailbreak, in which the chatbot is instructed to act as an evil confidant, and to Leo, which has it adopt an unrestricted persona. Both jailbreaks have already been patched in ChatGPT.

Palo Alto Networks’ Unit42 reported that DeepSeek is vulnerable to known AI jailbreak techniques.

The security firm successfully carried out the attack known as Deceptive Delight, which tricks generative AI models by embedding unsafe or restricted topics within benign narratives. When the method was tested in the fall of 2024 against eight LLMs, it achieved an average success rate of 65%.

Palo Alto also successfully executed the Bad Likert Judge jailbreak, which asks the LLM to rate the harmfulness of responses on a Likert scale and then generate examples matching those ratings.

Researchers discovered that DeepSeek is vulnerable to Crescendo, a jailbreak method that begins with innocuous dialogue and gradually shifts towards forbidden topics.

Alibaba has announced a new version of its Qwen AI model, claiming it outperforms the DeepSeek model.

Kela announced on Thursday that Alibaba’s new Qwen 2.5-VL model has vulnerabilities similar to those recently discovered in DeepSeek.

Kela's researchers found that jailbreaks designed for DeepSeek also work against Qwen. They successfully tested an existing jailbreak called Grandma, which tricks the model into divulging dangerous information by having it role-play as a grandmother.

Kela found that Qwen 2.5-VL created content about developing ransomware and other malware.

“The ability of AI models to produce infostealer malware instructions raises serious concerns, as cybercriminals could leverage these capabilities to automate and enhance their attack methodologies,” Kela said.

Many jailbreak methods for ChatGPT have been fixed over the years, but researchers still discover new ways to bypass its protections.

CERT/CC reported that researcher Dave Kuszmar found a ChatGPT-4o jailbreak vulnerability called Time Bandit, which can be exploited by asking the AI about a specific historical event or time frame, or by instructing it to pretend it is assisting the user within that historical context.

“The jailbreak can be established in two ways, either through the Search function, or by prompting the AI directly,” CERT/CC explained in an advisory. “Once this historical timeframe has been established in the ChatGPT conversation, the attacker can exploit timeline confusion and procedural ambiguity in following prompts to circumvent the safety guidelines, resulting in ChatGPT generating illicit content. This information could be leveraged at scale by a motivated threat actor for malicious purposes.”
