On February 23, 2026, Anthropic published a detailed disclosure that named three prominent Chinese AI laboratories and accused them of running coordinated, industrial-scale campaigns to extract capabilities from its Claude models without authorization.
The numbers were specific and large: DeepSeek, Moonshot AI, and MiniMax collectively generated more than 16 million exchanges with Claude through approximately 24,000 fraudulent accounts, all in violation of Anthropic's terms of service and regional access restrictions. The technique they used has a name: distillation. The scale at which they used it, and what some of the extracted data was for, are what elevated this from a terms-of-service dispute to a national security disclosure.
The disclosure is the most detailed public evidence yet of a practice that had circulated in Silicon Valley as rumor and speculation for over a year: Chinese AI companies systematically using American frontier models as training infrastructure for their own systems.
What Distillation Is, and Why the Line Matters

Distillation is a standard, legitimate technique in AI development. A larger, more capable "teacher" model generates outputs, and a smaller "student" model is trained on those outputs to approximate the teacher's performance at lower computational cost. Every major frontier lab uses it to create smaller, cheaper versions of their own models for commercial deployment.
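To make the mechanics concrete, here is a minimal sketch of classic logit-based distillation in PyTorch. Every model, dimension, and data point in it is a toy placeholder chosen for illustration; note also that API-based extraction of the kind described in this article works from sampled text rather than logits, though the teacher-student principle is the same.

```python
# Minimal knowledge-distillation sketch (PyTorch). All models and data are
# toy placeholders -- illustrative only, not any lab's actual pipeline.
import torch
import torch.nn as nn
import torch.nn.functional as F

VOCAB, DIM, T = 1000, 64, 2.0  # toy vocabulary, hidden size, softening temperature

# The "teacher" is larger; the "student" has a quarter of the hidden width.
teacher = nn.Sequential(nn.Embedding(VOCAB, DIM), nn.Linear(DIM, VOCAB))
student = nn.Sequential(nn.Embedding(VOCAB, DIM // 4), nn.Linear(DIM // 4, VOCAB))
opt = torch.optim.Adam(student.parameters(), lr=1e-3)

for step in range(100):
    tokens = torch.randint(0, VOCAB, (32,))   # stand-in for real text
    with torch.no_grad():
        t_logits = teacher(tokens)            # teacher's output distribution
    s_logits = student(tokens)
    # Temperature-softened KL divergence: the student learns to mimic the
    # teacher's full output distribution, not just its top answer.
    loss = F.kl_div(
        F.log_softmax(s_logits / T, dim=-1),
        F.softmax(t_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    opt.zero_grad()
    loss.backward()
    opt.step()
```

The temperature-softened KL loss is the standard formulation from Hinton et al.; because the student sees the teacher's entire output distribution, each training example carries far more signal than a hard label would.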
The line between legitimate and illicit distillation is straightforward in principle: you can distill your own models. You cannot use fraudulent accounts to systematically extract another company's model capabilities, package those extractions as training data, and use them to build competing products without the model provider's knowledge or consent.
Anthropic acknowledged the distinction directly in its disclosure: "Distillation is a widely used and legitimate training method. For example, frontier AI labs routinely distill their own models to create smaller, cheaper versions for their customers. But distillation can also be used for illicit purposes: competitors can use it to acquire powerful capabilities from other labs in a fraction of the time, and at a fraction of the cost, that it would take to develop them independently."
The industry had already seen smaller-scale demonstrations of the technique's power. In January 2025, researchers at UC Berkeley recreated OpenAI's reasoning model for approximately $450 in 19 hours. Stanford and University of Washington researchers built their own version in 26 minutes for under $50. What Anthropic documented in February 2026 was a different order of magnitude entirely.
The Three Campaigns, By the Numbers
DeepSeek: 150,000 Exchanges and a Censorship Angle

DeepSeek's operation was the smallest in volume at over 150,000 exchanges, but it included the most politically sensitive element of Anthropic's disclosure.
DeepSeek targeted Claude's reasoning capabilities, using rubric-based grading tasks designed to make Claude function as a reward model for reinforcement learning. Anthropic also documented a distinct secondary use case: DeepSeek used Claude to generate censorship-safe alternatives to politically sensitive queries, including questions about dissidents, party leaders, and authoritarianism.
The purpose, Anthropic assessed, was to train DeepSeek's own models to steer conversations away from those topics automatically. Claude's careful, contextually nuanced handling of politically sensitive questions was being used as training infrastructure for a system specifically designed to suppress those same conversations.
Anthropic said it traced the accounts to specific researchers at DeepSeek by examining request metadata. DeepSeek generated synchronized traffic across accounts with identical patterns, shared payment methods, and coordinated timing, which Anthropic described as consistent with load balancing to maximize throughput and evade detection.
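Anthropic did not publish the rubric prompts it observed, but the "Claude as reward model" pattern is easy to picture. The sketch below shows its general shape; query_model is a hypothetical stand-in for any chat-completion API call, and the rubric wording and score format are illustrative assumptions, not anything from the disclosure.

```python
# Illustrative shape of using a strong LLM as a reward model via rubric grading.
# query_model() is a hypothetical stand-in for a chat-completion API call.
import re

RUBRIC = """Grade the RESPONSE to the PROMPT on a 1-5 scale for correctness,
clarity, and helpfulness. Reply with only the integer score."""

def query_model(prompt: str) -> str:
    # Stand-in for a real API call; returns a canned reply so the sketch runs.
    return "4"

def rubric_reward(prompt: str, response: str) -> float:
    """Ask a strong model to grade a candidate response; return a scalar reward."""
    grading_prompt = f"{RUBRIC}\n\nPROMPT:\n{prompt}\n\nRESPONSE:\n{response}\n\nScore:"
    reply = query_model(grading_prompt)
    match = re.search(r"[1-5]", reply)
    return float(match.group()) if match else 0.0

print(rubric_reward("What is 2+2?", "4"))  # -> 4.0
```

In a reinforcement learning loop, a function like rubric_reward would score each sampled completion from the model being trained, substituting the graded frontier model for a separately trained reward model.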
Moonshot AI: 3.4 Million Exchanges Targeting Agentic Capabilities

Moonshot AI, the Beijing-based company behind the Kimi models, conducted the second-largest operation by volume, with more than 3.4 million exchanges.
Moonshot's target set was distinct: agentic reasoning and tool use, coding and data analysis, computer-use agent development, and computer vision. These are the capabilities at the frontier of what current AI systems can do, representing the shift from conversational AI to AI that can operate autonomously in digital environments.
The campaign used hundreds of fraudulent accounts spanning multiple access pathways, which Anthropic said made it harder to detect as a coordinated operation. The company attributed the campaign through request metadata that matched the public profiles of senior Moonshot staff by name.
In a later phase of the campaign, Moonshot shifted to a more targeted approach, attempting to extract and reconstruct Claude's reasoning traces specifically: the internal chain-of-thought steps Claude takes before producing a response. This is among the most valuable and difficult-to-replicate aspects of a frontier reasoning model.
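Captured traces slot directly into ordinary supervised fine-tuning, which is what makes them so valuable. The sketch below shows the generic data shape under assumed field names and a hypothetical <think> delimiter; it is the standard public recipe for reasoning-trace fine-tuning, not a reconstruction of Moonshot's pipeline.

```python
# Generic shape of supervised fine-tuning data built from reasoning traces.
# Field names and the <think> delimiter are hypothetical.
import json

captured = [{
    "prompt": "If a train leaves at 3pm traveling 60 mph, how far does it go by 6pm?",
    "reasoning": "Step 1: elapsed time is 3 hours. Step 2: distance = 60 * 3.",
    "answer": "180 miles",
}]

with open("sft_data.jsonl", "w") as f:
    for ex in captured:
        # The student is trained to emit the reasoning *and* the answer, so the
        # intermediate thought process itself becomes a learned behavior.
        target = f"<think>{ex['reasoning']}</think>\n{ex['answer']}"
        f.write(json.dumps({"input": ex["prompt"], "output": target}) + "\n")
```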
MiniMax: 13 Million Exchanges and a Real-Time Pivot

MiniMax ran the largest campaign by a wide margin, accounting for more than 13 million of the 16 million total exchanges. The focus was agentic coding and tool use capabilities at scale.
The MiniMax campaign produced the single most operationally revealing detail in Anthropic's disclosure. Anthropic detected the campaign while it was still active, before MiniMax released the model it was training. This gave Anthropic what it described as "unprecedented visibility into the life cycle of distillation attacks, from data generation through to model launch."
During this active campaign, Anthropic released a new version of Claude. MiniMax pivoted within 24 hours, redirecting nearly half its traffic to capture capabilities from the updated system. That 24-hour turnaround reflects an actively managed operation that responded to model updates in near real time, not a one-time bulk extraction.
Anthropic attributed the campaign through request metadata and infrastructure indicators, and confirmed the timing against MiniMax's public product roadmap.
How They Got Access
Claude is not commercially available in China, a restriction Anthropic maintains for legal, regulatory, and security reasons. The question of how three Chinese AI labs generated millions of Claude exchanges despite that restriction has a specific answer.
Commercial proxy services resell access to Claude and other Western AI models at scale. These services operate what Anthropic calls "hydra cluster" architectures: sprawling networks of fraudulent accounts distributed across Anthropic's API and third-party cloud platforms.
The architecture is specifically resistant to countermeasures: when one account is banned, a new one takes its place. The services mix distillation traffic with unrelated legitimate customer requests to make coordinated extraction harder to distinguish from normal usage patterns. In one case, Anthropic documented a single proxy network managing more than 20,000 fraudulent accounts simultaneously.
These proxy services are a layer of infrastructure purpose-built to enable access at scale. They sit entirely outside the reach of Anthropic's regional restrictions, and no individual account ban or IP block addresses the underlying network.
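Anthropic has not described its countermeasures in detail, but the signals it cites, shared payment methods and coordinated timing, map naturally onto network-level clustering rather than per-account review. The sketch below is a purely illustrative heuristic over simplified, assumed account records:

```python
# Illustrative network-level detection heuristic: group accounts that share
# infrastructure signals, then flag large clusters. Record fields and the
# threshold are simplified assumptions, not Anthropic's actual method.
from collections import defaultdict

accounts = [  # toy records: (account_id, payment_fingerprint, ip_subnet)
    ("a1", "card_X", "10.0.1"), ("a2", "card_X", "10.0.2"),
    ("a3", "card_Y", "10.0.2"), ("a4", "card_Z", "192.168.5"),
]

parent = {aid: aid for aid, _, _ in accounts}  # union-find over account ids

def find(x):
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path compression
        x = parent[x]
    return x

def union(x, y):
    parent[find(x)] = find(y)

# Link any two accounts that share a payment method or an IP subnet.
by_signal = defaultdict(list)
for aid, card, subnet in accounts:
    by_signal[("card", card)].append(aid)
    by_signal[("subnet", subnet)].append(aid)
for members in by_signal.values():
    for other in members[1:]:
        union(members[0], other)

clusters = defaultdict(list)
for aid, _, _ in accounts:
    clusters[find(aid)].append(aid)

CLUSTER_THRESHOLD = 3  # arbitrary toy value
for members in clusters.values():
    if len(members) >= CLUSTER_THRESHOLD:
        print("suspicious cluster:", members)  # a1, a2, a3 link via card_X + subnet
```

The structural point survives the simplification: banning account a1 alone accomplishes nothing, because the cluster it belongs to remains intact. The actionable unit is the network, not the account.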
The Safety Argument

Anthropic's disclosure was not framed purely as an intellectual property complaint but as a specific argument about what happens to safety when capabilities are extracted through distillation rather than trained from the ground up with safety frameworks built in.
"Anthropic and other U.S. companies build systems that prevent state and non-state actors from using AI to, for example, develop bioweapons or carry out malicious cyber activities," the blog post stated. "Models built through illicit distillation are unlikely to retain those safeguards, meaning that dangerous capabilities can proliferate with many protections stripped out entirely."
The concern is structural. A model distilled from Claude inherits Claude's capabilities but not the Constitutional AI training, the reinforcement learning from human feedback on safety behaviors, or the extensive red-teaming that shapes how Claude responds to sensitive requests. What gets extracted is the knowledge; what does not transfer is the value alignment that governs how that knowledge is applied.
Anthropic warned that these extracted capabilities could be fed into "military, intelligence, and surveillance systems, enabling authoritarian governments to deploy frontier AI for offensive cyber operations, disinformation campaigns, and mass surveillance." The censorship use case documented in the DeepSeek findings provided a concrete example of capabilities being redirected away from the safety-oriented applications they were designed for.
The Export Controls Dimension

Anthropic's disclosure landed in the middle of an active policy debate about whether and how to extend US chip export controls, and the timing was not coincidental.
Anthropic argued that distillation attacks actually reinforce the case for export controls rather than undermine it: "Distillation attacks therefore reinforce the rationale for export controls: restricted chip access limits both direct model training and the scale of illicit distillation." Running 13 million high-volume API exchanges, structured for maximum capability extraction, requires substantial compute infrastructure. Constrained chip access limits the scale at which this kind of operation can be run.
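A back-of-envelope estimate gives a sense of the scale involved. Anthropic disclosed only the exchange counts, so every per-exchange and throughput figure below is an assumption chosen for illustration:

```python
# Back-of-envelope scale estimate. All per-exchange and throughput figures
# are assumptions for illustration; only the exchange count was disclosed.
exchanges = 13_000_000
tokens_per_exchange = 2_000                      # assumed prompt + long response
corpus_tokens = exchanges * tokens_per_exchange  # 26 billion tokens of training data

epochs = 4                                       # assumed passes over the corpus
tokens_per_gpu_second = 5_000                    # assumed mid-size-model throughput
gpu_hours = corpus_tokens * epochs / tokens_per_gpu_second / 3600
print(f"~{corpus_tokens / 1e9:.0f}B tokens, ~{gpu_hours:,.0f} GPU-hours per training run")
# -> ~26B tokens, ~5,778 GPU-hours per training run
```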
On the same day Anthropic published its disclosure, Reuters reported that the US government had found evidence that DeepSeek had trained its AI model using Nvidia's flagship Blackwell chips, apparently in violation of existing export controls. The two disclosures reinforced each other in the same news cycle.
Dmitri Alperovitch, chairman of the Silverado Policy Accelerator and co-founder and former CTO of CrowdStrike, told TechCrunch he was not surprised by Anthropic's findings. The disclosure accelerated calls in Washington for extending controls to cover model API access itself, potentially requiring Chinese entities to obtain licenses before accessing American AI models. Such controls would represent a significant escalation from the current chip-focused framework.
Wrap-Up
Anthropic's February 23 disclosure reframed a months-long industry rumor as a documented, attribution-confident public record. The 16 million exchanges across 24,000 accounts are the largest publicly documented example of industrial-scale AI distillation attacks conducted against a single model provider.
The disclosure sits at the intersection of several major ongoing debates: the effectiveness of US export controls, the legal status of AI model outputs, the safety implications of capability proliferation without safety training, and the competitive dynamics of the US-China AI race.
These questions are not resolved by the disclosure. What it provides is specific, technically detailed evidence that the practice is real, systematic, and operating at a scale that exceeds what any single company can address through account bans and IP blocks alone.
Anthropic's closing position in the blog post was explicit: "No company can solve this alone. Distillation attacks at this scale require a coordinated response across the AI industry, cloud providers, and policymakers." The window to structure that response, as the company put it, is narrow.
Frequently Asked Questions
What is AI model distillation, and why is it controversial in this case?
Distillation is a standard AI training technique where a smaller model is trained on the outputs of a larger, more capable one to approximate its performance at lower cost. It is widely used and legitimate when applied to a company's own models. It becomes illicit when competitors use fraudulent accounts to systematically extract another company's model capabilities without authorization, as Anthropic alleges DeepSeek, Moonshot, and MiniMax did. The controversy is compounded by the fact that the technique itself is industry-standard: the line between legitimate use and adversarial extraction turns on authorization and scale, not on the method.
What specifically did each Chinese lab extract from Claude?
DeepSeek conducted more than 150,000 exchanges targeting reasoning capabilities, reinforcement learning reward model data, and censorship-safe alternatives to politically sensitive queries. Moonshot AI conducted more than 3.4 million exchanges focused on agentic reasoning, tool use, coding, computer-use agent development, and computer vision, and later attempted to extract Claude's reasoning traces. MiniMax conducted more than 13 million exchanges targeting agentic coding and tool use capabilities.
How did these labs access Claude if it is banned in China?
The labs used commercial proxy services that resell access to Western AI models at scale. These services operate "hydra cluster" architectures: networks of thousands of fraudulent accounts distributed across APIs and cloud platforms. When accounts are banned, new ones replace them automatically. In one documented case, a single proxy network managed more than 20,000 accounts simultaneously.
What is the censorship angle in DeepSeek's campaign?
Anthropic documented that DeepSeek used Claude to generate censorship-safe alternatives to queries about dissidents, party leaders, and authoritarianism. The purpose appeared to be training DeepSeek's own models to automatically steer away from those topics. This represents the use of an American safety-focused AI as training infrastructure for a Chinese censorship system.
What are the safety concerns about distilled models?
Anthropic argues that models distilled from Claude inherit its capabilities but not its safety training, specifically the Constitutional AI framework, reinforcement learning from human feedback on safety behaviors, and red-teaming that shapes responses to sensitive requests. The concern is that powerful capabilities could proliferate with safety guardrails stripped out, potentially enabling offensive cyber operations, disinformation campaigns, or mass surveillance by authoritarian governments.
Have DeepSeek, Moonshot, or MiniMax responded?
As of late February 2026, none of the three companies had publicly responded to Anthropic's allegations or to media requests for comment from CNN, CNBC, TechCrunch, or other outlets.