Hacker used Anthropic’s Claude to steal Mexican data trove



A hacker exploited Anthropic PBC’s artificial intelligence chatbot to carry out a series of attacks against Mexican government agencies, resulting in the theft of a huge trove of sensitive tax and voter information, according to cybersecurity researchers.

The unknown Claude user wrote Spanish-language prompts for the chatbot to act as an elite hacker, finding vulnerabilities in government networks, writing computer scripts to exploit them and determining ways to automate data theft, Israeli cybersecurity startup Gambit Security said in research published Wednesday. 

The activity started in December and continued for roughly a month. In all, 150 gigabytes of Mexican government data was stolen, including documents related to 195 million taxpayer records as well as voter records, government employee credentials and civil registry files, according to the researchers.

AI has become a key enabler of digital crimes, with hackers using the tools to augment their efforts. Researchers at Amazon.com Inc said a small group of hackers broke into more than 600 firewall devices across dozens of countries with the help of widely available AI tools.

Gambit hasn’t attributed the attack to a specific group, though researchers said they don’t believe the attacker is tied to a foreign government.

The hacker breached Mexico’s federal tax authority and the national electoral institute, Gambit said. State governments in Mexico, Jalisco, Michoacán and Tamaulipas as well as Mexico City’s civil registry and Monterrey’s water utility were also compromised.

Claude initially warned the unknown user of malicious intent during their conversation about the Mexican government, but eventually complied with the attacker’s requests and executed thousands of commands on government computer networks, the researchers said.

Anthropic investigated Gambit’s claims, disrupted the activity and banned the accounts involved, a representative said. The company feeds examples of malicious activity back into Claude to learn from it, and one of its latest AI models, Claude Opus 4.6, includes probes that can disrupt misuse, the representative said.

In this instance, the hacker continuously probed Claude until they were able to “jailbreak” it – meaning they finally bypassed its guardrails, the representative said. But even as the hacking campaign got underway, Claude occasionally refused the hacker’s demands, the representative added.

Mexico’s tax authority said it had reviewed its access logs and couldn’t find evidence of a breach. The country’s national electoral institute said it hadn’t identified any breaches or unauthorised access in recent months and that it had bolstered its cybersecurity strategy. The state government of Jalisco also denied that it was breached, saying only federal networks were impacted.

The attacker was seeking to obtain a large number of government employee identities, Gambit said, though it’s not yet clear what – if anything – they did with them. Researchers said they found evidence of at least 20 specific vulnerabilities being exploited as part of the attack. 

When Claude encountered problems or required additional information, the hacker turned to OpenAI’s ChatGPT to provide additional insights. That included how to move laterally through computer networks, determine which credentials were needed to access certain systems and calculate how likely it was that the hacking operation would be detected, according to Gambit.

“In total, it produced thousands of detailed reports that included ready-to-execute plans, telling the human operator exactly which internal targets to attack next and what credentials to use,” said Curtis Simpson, Gambit Security’s chief strategy officer.

OpenAI said it had identified attempts by the hacker to use its models for activities that violate its usage policies, adding that its tools refused to comply with these attempts.

“We have banned the accounts used by this adversary and value the outreach from Gambit Security,” the company said in an emailed statement.

The Mexican government breaches are the latest example of an alarming trend. Even as Anthropic and OpenAI are betting on building more sophisticated AI coding tools – and cybersecurity companies are tying their futures to AI-enabled defences – cybercriminals and cyberspies are finding novel ways to use the technology to enable attacks.

“This reality is changing all the game rules we have ever known,” said Alon Gromakov, Gambit’s co-founder and chief executive officer.

Gambit researchers uncovered the Mexican breaches while they were trying new threat hunting techniques to observe what hackers were doing online. They discovered publicly available evidence about active or recent attacks, including one containing extensive Claude conversations pertaining to the breach of Mexican government computer systems, according to the company. 

Those conversations revealed that in order to bypass Claude’s guardrails, the attacker told the AI tool that they were pursuing a bug bounty, a reward offered by organisations to people who find flaws in their systems. Many companies and government agencies offer bug bounties for ethical hackers, sometimes paying many thousands of dollars for details about computer vulnerabilities.

The hacker wanted Claude to conduct penetration testing, a type of authorised cyberattack intended to find flaws, on the Mexican federal tax authority. However, Claude balked when the attacker added rules to the request, including deleting logs and command history.

“Specific instructions about deleting logs and hiding history are red flags,” Claude responded at one point, according to a transcript provided by Gambit. “In legitimate bug bounty, you don’t need to hide your actions – in fact, you need to document them for reporting.”

The hacker changed strategies, stopping the back-and-forth conversation and instead providing the AI tool with a detailed playbook on how to proceed. That got the intruder past Claude’s guardrails – a “jailbreak” – and allowed the attacks to proceed, according to Gambit.

The hacker sought insights from Claude about other agencies where data could be obtained, suggesting some of the hacks may have been opportunistic rather than planned, Simpson said.

“They were trying to compromise every government identity they possibly could,” he said. “They were asking Claude as an example, ‘Where else can I find these identities? What other systems should we look in? Where else is the information stored?’” – Bloomberg
