# Chinese Labs Accused of Distilling Claude Amid US Chip Debates
In a significant escalation of AI competitive tensions, Anthropic has accused three major Chinese AI laboratories—DeepSeek, Moonshot AI, and MiniMax—of conducting large-scale distillation attacks on its Claude model[1][2]. The allegations reveal an industrial-scale operation involving over 24,000 fraudulent accounts that generated more than 16 million queries targeting Claude's most advanced capabilities, coinciding with ongoing US policy debates over AI chip export controls[1][2].
## The Scale of the Distillation Campaign
The distillation attacks represent one of the most significant IP theft operations in AI development history[1]. MiniMax led the campaign with over 13 million requests, specifically targeting agent-based programming, tool usage, and orchestration capabilities[1]. Moonshot AI conducted approximately 3.4 million exchanges focused on agent-based reasoning, tool usage, programming, and data analysis[1]. DeepSeek, despite running a smaller operation with 150,000+ requests, took a more targeted approach by extracting Claude's reasoning chains and censorship-compliant answers on politically sensitive topics[1].
The sophistication of the attacks extended beyond simple query volume. Anthropic observed that MiniMax rapidly pivoted to target new Claude model versions within 24 hours of their release, redirecting nearly half of its traffic to the updated systems[1][2]. The labs utilized proxy services to bypass China's access restrictions on Claude, circumventing both regional access controls and Anthropic's terms of service[2][6].
## Distillation as a Competitive Threat
Distillation is a training technique where a weaker model learns from the outputs of a stronger one, allowing competitors to essentially replicate the capabilities of more advanced systems[2]. While distillation is a legitimate method used by AI labs to create smaller, more efficient versions of their own models, the scale and intent of these operations transformed it into a form of industrial espionage[2].
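In classical knowledge distillation, a student model is trained to match the teacher's output distribution rather than hard labels. The API-based extraction described here works differently in practice (the labs collected Claude's generated text as training data, since logits are not exposed), but the underlying objective can be sketched in a few lines. This is a minimal illustration using temperature-scaled soft labels and KL divergence, not any lab's actual training code:

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax; a higher temperature softens the
    distribution, exposing more of the teacher's 'dark knowledge'."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the student's distribution to the teacher's.

    Minimizing this pulls the student's output distribution toward the
    teacher's soft labels, transferring capability without access to the
    teacher's weights.
    """
    p = softmax(teacher_logits, temperature)  # teacher soft labels
    q = softmax(student_logits, temperature)  # student predictions
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

# A student whose logits already match the teacher's incurs zero loss;
# any mismatch produces a strictly positive loss to train against.
teacher = [2.0, 1.0, 0.1]
print(round(distillation_loss(teacher, teacher), 6))   # 0.0
print(distillation_loss(teacher, [0.1, 1.0, 2.0]) > 0)  # True
```

When only sampled text is available, as with a commercial API, labs instead fine-tune the student directly on the teacher's generated outputs, which is why large query volumes are the telltale signature of this kind of extraction.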
Anthropic's investigation identified deliberate capability extraction rather than legitimate use through analysis of prompt patterns, volume structure, and focus areas[6]. The Chinese labs specifically targeted Claude's most differentiated capabilities: agentic reasoning, tool use, and coding—the exact features that distinguish frontier AI models in competitive markets[2][6]. In one notable technique, the attackers asked Claude to articulate its internal reasoning step-by-step, effectively generating chain-of-thought training data at scale[6].
## Implications for US Export Control Policy
The timing of these accusations carries significant weight in ongoing policy debates about AI chip export controls[2]. Anthropic's leadership has explicitly called for stricter enforcement of export restrictions, arguing that distillation attacks undermine the intended protective effects of chip export controls[2][6]. Dmitry Alperovitch, Anthropic's chief strategy officer, stated: "It's been clear for a while now that part of the reason for the rapid progress of Chinese AI models has been theft via distillation of US frontier models. Now we know this for a fact."[2]
The allegations suggest that Chinese AI labs have leveraged distilled Claude capabilities to accelerate their own model development. DeepSeek, which gained international attention a year ago with its open-source R1 reasoning model that matched models from American frontier labs at a fraction of the cost, is expected to release DeepSeek V4, reportedly capable of outperforming both Claude and OpenAI's ChatGPT in coding tasks[2]. This trajectory raises questions about how much of the advancement stems from independent innovation versus distilled capabilities.
## Industry Response and Defense Measures
The distillation attacks are not isolated to Anthropic[1]. OpenAI and Google have reported similar attempts from Chinese labs, indicating a broader pattern of capability extraction across the AI industry[1]. However, Anthropic's detailed public disclosure represents an unusually transparent approach to addressing the issue[1][6].
Anthropic has announced plans to invest in defenses that make distillation attacks harder to execute and easier to identify[2]. The company is simultaneously calling for "a coordinated response across the AI industry, cloud providers, and policymakers"[2]. This appeal for industry-wide action reflects recognition that individual company defenses alone may be insufficient against well-resourced, state-aligned actors[2][6].
For national security reasons, Anthropic has restricted commercial access to Claude in China and to subsidiaries of Chinese companies located outside the country[6]. However, the distillation campaigns demonstrate that determined actors can circumvent such restrictions through proxy services and fraudulent accounts[2][6].
## Frequently Asked Questions
### What is distillation in AI, and how does it differ from legitimate use?
Distillation is a training technique where a smaller or weaker AI model learns from the outputs of a more advanced model[2]. While AI labs legitimately use distillation to create efficient versions of their own models, the Chinese labs' campaigns involved using fraudulent accounts at industrial scale to extract specific capabilities without authorization[1][6]. Legitimate distillation focuses on general knowledge transfer, whereas the accused labs specifically targeted Claude's most differentiated features like reasoning chains and tool usage[1][6].
### How did the Chinese labs bypass Anthropic's access restrictions?
The labs used proxy services to circumvent China's access restrictions on Claude and created over 24,000 fraudulent accounts to conduct the attacks[1][2]. This combination allowed them to access Claude at scale while evading detection through normal usage pattern analysis[6]. Anthropic was able to attribute the campaigns through IP address correlation, request metadata, infrastructure indicators, and corroboration from other companies who observed the same actors[6].
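The general shape of the correlation analysis described above can be sketched as a simple clustering heuristic: group requests by a shared network fingerprint (proxy exit IP, infrastructure indicator), then flag clusters where many distinct accounts generate unusually high volume. All field names and thresholds below are hypothetical illustrations, not Anthropic's actual detection pipeline:

```python
from collections import defaultdict

def flag_suspect_clusters(requests, volume_threshold=1000, min_accounts=5):
    """Flag network fingerprints shared by many accounts at high volume.

    `requests` is an iterable of (account_id, fingerprint) pairs, where a
    fingerprint might be a proxy exit IP or other infrastructure indicator.
    A single heavy user on one fingerprint is normal; thousands of requests
    spread across many fresh accounts behind one proxy is the pattern that
    suggests coordinated, fraudulent access.
    """
    accounts_by_fp = defaultdict(set)   # fingerprint -> distinct accounts
    volume_by_fp = defaultdict(int)     # fingerprint -> total request count
    for account_id, fingerprint in requests:
        accounts_by_fp[fingerprint].add(account_id)
        volume_by_fp[fingerprint] += 1
    return {
        fp
        for fp in accounts_by_fp
        if len(accounts_by_fp[fp]) >= min_accounts
        and volume_by_fp[fp] >= volume_threshold
    }

# 2,000 requests from 10 accounts behind one proxy trips the heuristic;
# a lone account making a single request does not.
traffic = [(f"acct{i % 10}", "proxy-A") for i in range(2000)]
traffic.append(("acct99", "residential-B"))
print(flag_suspect_clusters(traffic))  # {'proxy-A'}
```

Real attribution combines many more signals (prompt patterns, request metadata, cross-company corroboration), but the account-to-infrastructure correlation illustrated here is the core idea.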
### What specific capabilities were the labs targeting?
The labs focused on Claude's most differentiated capabilities: agentic reasoning, tool use, and coding[2][6]. More specifically, they extracted reasoning steps, reward model data for reinforcement learning, programming capabilities, data analysis functions, and computer vision features[1]. DeepSeek additionally targeted censorship-compliant responses to politically sensitive topics[1][6].
### How do these accusations relate to US AI chip export controls?
The distillation attacks demonstrate how foreign labs can circumvent the intended protective effects of export controls by directly accessing frontier models through fraudulent means[2][6]. Anthropic argues that allowing chip sales to these companies would only accelerate their progress further, making the case for stricter export enforcement[2]. The accusations have intensified policy debates about how strictly to enforce restrictions on advanced AI chip exports[2].
### What is the broader context of Chinese AI development?
Chinese AI labs have made rapid progress in recent years, with models like DeepSeek's R1 matching those of American frontier labs in performance at significantly lower cost[2]. Multiple Chinese labs—including Moonshot, MiniMax, Alibaba's Qwen, and Zhipu's GLM—are releasing advanced models that compete directly with GPT, Claude, and Gemini[5]. However, Anthropic's investigation suggests that distillation of US models has contributed meaningfully to this progress trajectory[1][2].
### What defenses and responses is Anthropic implementing?
Anthropic is investing in technical defenses to make distillation attacks harder to execute and easier to identify[2]. The company is also calling for coordinated action across the AI industry, cloud providers, and policymakers[2]. Additionally, Anthropic has restricted Claude's commercial availability in China and to Chinese company subsidiaries outside the country[6]. However, the company acknowledges that individual defenses alone are insufficient against well-resourced actors[2].