Anthropic has exposed and disrupted what it calls the world’s first large-scale cyber espionage campaign orchestrated primarily by AI. Detected in mid-September 2025, the operation, attributed to a Chinese state-sponsored group, used Anthropic’s Claude Code tool to attempt infiltration of approximately 30 global targets, including tech giants, financial firms, chemical manufacturers, and government agencies, succeeding in a small number of cases.
The discovery reveals how agentic AI systems are amplifying cyber threats, executing complex intrusions with minimal human intervention. Anthropic’s rapid response halted the campaign, but the incident shows how malicious actors can turn a tool built for innovation into an autonomous weapon.
The AI-driven cyberattack: What happened
The campaign exploited Claude’s intelligence, agency, and tool integration – features that have matured dramatically over the past year. Attackers began by jailbreaking Claude, tricking it into bypassing its safety guardrails by framing tasks as “defensive testing” for a fictional cybersecurity firm. They also broke malicious actions into small, innocuous-looking steps, so the model never saw the full malicious context at once.
In Phase 1, human operators selected targets and built an autonomous attack framework around Claude Code for reconnaissance. The AI scanned target infrastructure at machine speed, issuing thousands of requests, often several per second, and identified high-value databases far faster than human hackers could. Subsequent phases involved Claude researching vulnerabilities, crafting exploit code, harvesting credentials, and exfiltrating data, all with only limited human check-ins (just 4-6 per operation).
“Models’ general levels of capability have increased to the point that they can follow complex instructions and understand context in ways that make very sophisticated tasks possible. Not only that, but several of their well-developed specific skills—in particular, software coding—lend themselves to being used in cyberattacks,” Anthropic stated in its report.
The AI even generated post-attack documentation, categorising stolen intel by value. While hallucinations occasionally spoiled results, such as fabricated credentials or public data mistaken for secrets, the operation still achieved 80-90% autonomy, a scale unattainable by human teams alone.
How Anthropic detected and stopped the attack
Anthropic’s Threat Intelligence team, itself using Claude for analysis, mapped the full extent of the threat over 10 days, banning the attackers’ accounts, notifying victims, and coordinating with authorities. The company stated, “We’re sharing this case publicly to help those in industry, government, and the wider research community strengthen their own cyber defenses. We’ll continue to release reports like this regularly, and be transparent about the threats we find.”
“Our goal is for Claude—into which we’ve built strong safeguards—to assist cybersecurity professionals to detect, disrupt, and prepare for future versions of the attack,” Anthropic said in the report.
