Tuesday , June 24 2025
Passwords

Nearly 12,000 API Keys and Passwords Found in Public Datasets

Security researchers found that datasets used by companies to develop large language models included API keys, passwords, and other sensitive credentials.

Large language models are dominating the online landscape, with companies promoting AI solutions that claim to solve all problems.

WhatsApp banned on all US House of Representatives devices

The U.S. House of Representatives has banned congressional staff from using WhatsApp on government devices due to security concerns, as...
Read More
WhatsApp banned on all US House of Representatives devices

Kaspersky found “SparkKitty” Malware on Google Play, Apple App Store

Kaspersky found a new mobile malware dubbed SparkKitty in Google Play and Apple App Store apps, targeting Android and iOS....
Read More
Kaspersky found “SparkKitty” Malware on Google Play, Apple App Store

OWASP AI Testing Guide Launched to Uncover Vulns in AI Systems

OWASP has released its AI Testing Guide, a framework to help organizations find and fix vulnerabilities specific to AI systems....
Read More
OWASP AI Testing Guide Launched to Uncover Vulns in AI Systems

Axentec Launches Bangladesh’s First Locally Hosted Tier-4 Cloud Platform

In a major milestone for the country’s digital infrastructure, Axentec PLC has officially launched Axentec Cloud, Bangladesh’s first Tier-4 cloud...
Read More
Axentec Launches Bangladesh’s First Locally Hosted Tier-4 Cloud Platform

Hackers Bypass Gmail MFA With App-Specific Password Reuse

A hacking group reportedly linked to Russian government has been discovered using a new phishing method that bypasses two-factor authentication...
Read More
Hackers Bypass Gmail MFA With App-Specific Password Reuse

Russia detects first SuperCard malware attacks via NFC

Russian cybersecurity experts discovered the first local data theft attacks using a modified version of legitimate near field communication (NFC)...
Read More
Russia detects first SuperCard malware attacks via NFC

Income Property Investments exposes 170,000+ Individuals record

Cybersecurity researcher Jeremiah Fowler discovered an unsecured database with 170,360 records belonging to a real estate company. It contained personal...
Read More
Income Property Investments exposes 170,000+ Individuals record

ALERT (CVE: 2023-28771)
Zyxel Firewalls Under Attack via CVE-2023-28771 by 244 IPs

GreyNoise found attempts to exploit CVE-2023-28771, a vulnerability in Zyxel's IKE affecting UDP port 500. The attack centers around CVE-2023-28771,...
Read More
ALERT (CVE: 2023-28771)  Zyxel Firewalls Under Attack via CVE-2023-28771 by 244 IPs

CISA Flags Active Exploits in Apple iOS and TP-Link Routers

The U.S. Cybersecurity and Infrastructure Security Agency (CISA) has recently included two high-risk vulnerabilities in its Known Exploited Vulnerabilities (KEV)...
Read More
CISA Flags Active Exploits in Apple iOS and TP-Link Routers

10K Records Allegedly from Mac Cloud Provider’s Customers Leaked Online

SafetyDetectives’ Cybersecurity Team discovered a public post on a clear web forum in which a threat actor claimed to have...
Read More
10K Records Allegedly from Mac Cloud Provider’s Customers Leaked Online

For an AI to be effective, it needs extensive training data, much of which is gathered from the Internet by specialized companies and organizations.

Common Crawl provides datasets for companies to train their AI, gathering information from the internet, which may include sensitive data.

Researchers from Truffle Security discovered that credentials, API keys, and passwords are being exposed. The main issue is that some web developers hardcode sensitive information into websites, which then ends up in LLM training data.

Researchers discovered 11,908 live secrets, such as API keys and passwords, across 2.76 million websites.

“Leaked keys in Common Crawl’s dataset should not reflect poorly on their organization; it’s not their fault developers hardcode keys in front-end HTML and JavaScript on web pages they don’t control. And Common Crawl should not be tasked with redacting secrets; their goal is to provide a free, public dataset based on the public Internet for organizations like Truffle Security to conduct this type of research,” explained the researchers.

Companies that create LLMs have warned against hardcoding sensitive information on websites. They advise avoiding this practice, as users may unintentionally share the code in their work, worsening the issue.

Check Also

GreyNoise

ALERT (CVE: 2023-28771)
Zyxel Firewalls Under Attack via CVE-2023-28771 by 244 IPs

GreyNoise found attempts to exploit CVE-2023-28771, a vulnerability in Zyxel’s IKE affecting UDP port 500. …

Leave a Reply

Your email address will not be published. Required fields are marked *