The Distillation Wars: What the Claude-Copying Allegations Mean for AI Strategy
A briefing on the evidence, the nuance, and what it signals about the durability of frontier-model advantage
For most of 2024 and 2025, the claim that Chinese open-weight labs were “just distilling GPT-4 and Claude” was a talking point — repeated in comment threads and congressional hearings, but rarely backed by anything more concrete than benchmark suspicion. That changed in 2026. Twice this year, a frontier US lab moved from suspicion to specific, quantified, named accusation. The result is one of the clearest public windows anyone has had into how capability actually diffuses from proprietary frontier models into the open-weight ecosystem — and it has direct implications for any executive whose competitive strategy assumes that a model advantage, once built, stays built.
This briefing separates what is now documented from what remains speculative, and draws out what the pattern means strategically.
The allegations, quantified
On February 23, 2026, Anthropic published a detailed account titled “Detecting and Preventing Distillation Attacks,” accusing three Chinese AI laboratories of running what it called industrial-scale extraction campaigns against Claude: DeepSeek, Moonshot AI (maker of the Kimi models), and MiniMax.¹ The company’s own figures, as reported by Bloomberg, CNBC, and the South China Morning Post, were specific:²·³·⁴
- Roughly 24,000 fraudulent accounts were used in total, in violation of Anthropic’s terms of service and regional access restrictions.
- Those accounts generated more than 16 million exchanges with Claude.
- MiniMax accounted for the largest share — roughly 13 million exchanges, concentrated on agentic coding and tool orchestration. Anthropic reported that when it shipped a new Claude model during MiniMax’s active campaign, MiniMax redirected nearly half its traffic to the new system within 24 hours.¹
- Moonshot AI generated roughly 3.4 million exchanges, initially targeting agentic reasoning and computer-use tasks, later shifting to a more targeted effort to reconstruct Claude’s internal reasoning traces. Anthropic said it attributed the campaign partly through request metadata that matched the public professional profiles of senior Moonshot staff.¹
- DeepSeek’s share was the smallest, around 150,000 exchanges, focused on reasoning tasks, rubric-based grading, and rewrites of politically sensitive queries.¹
Less than four months later, the scale escalated. In a June 10, 2026 letter to the U.S. Senate Banking Committee, Anthropic alleged that operators affiliated with Alibaba’s Qwen division had run the largest distillation campaign it had ever disclosed: roughly 25,000 fraudulent accounts generating 28.8 million exchanges with Claude between April 22 and June 5, 2026, targeting software engineering, agentic reasoning, and cybersecurity capabilities.⁵·⁶·⁷ Alibaba has denied the allegations, and as of this writing the letter itself has not been made public in full.⁶
This followed an earlier, structurally similar pattern: OpenAI accused DeepSeek of comparable behavior in 2025, alleging that DeepSeek employees wrote code to programmatically query OpenAI’s models through obfuscated third-party routers to harvest training data at scale.⁸·⁹
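The disclosed figures above can be sanity-checked with simple arithmetic. The following Python sketch uses only the numbers quoted in this section, which are Anthropic's own rounded estimates. The variable and function names are illustrative, and the 45-day window assumes that both April 22 and June 5 are counted.

```python
from datetime import date

# Figures as quoted in this briefing (Anthropic's disclosed, rounded estimates).
FEB_EXCHANGES = {"MiniMax": 13_000_000, "Moonshot AI": 3_400_000, "DeepSeek": 150_000}
FEB_ACCOUNTS = 24_000
JUNE_EXCHANGES = 28_800_000
JUNE_ACCOUNTS = 25_000
JUNE_START, JUNE_END = date(2026, 4, 22), date(2026, 6, 5)


def summarize():
    """Derive totals, shares, and averages from the quoted figures."""
    feb_total = sum(FEB_EXCHANGES.values())
    # Assumption: the campaign window is inclusive of both endpoints.
    june_days = (JUNE_END - JUNE_START).days + 1
    return {
        "feb_total": feb_total,
        "feb_shares": {lab: n / feb_total for lab, n in FEB_EXCHANGES.items()},
        "feb_per_account": feb_total / FEB_ACCOUNTS,
        "june_days": june_days,
        "june_per_account": JUNE_EXCHANGES / JUNE_ACCOUNTS,
        "june_per_day": JUNE_EXCHANGES / june_days,
    }


if __name__ == "__main__":
    for key, value in summarize().items():
        print(key, value)
```

On these rounded inputs, the three named February campaigns sum to about 16.55 million exchanges, consistent with "more than 16 million," and MiniMax accounts for roughly 78.5% of that total. The alleged Qwen-affiliated campaign works out to an average of 1,152 exchanges per account and 640,000 exchanges per day. These are averages derived from summary figures, not a description of how traffic was actually distributed.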
Why this is a strategy story, not just a tech story
Three things make this more than an IP squabble between labs.
First, it quantifies how cheaply a capability gap can be closed. Anthropic itself does not dispute that distillation is a legitimate, industry-standard technique — the company’s own post acknowledges that “frontier AI labs routinely distill their own models to create smaller, cheaper versions.”¹ What it alleges is different: that the technique was pointed outward, at a competitor’s proprietary system, at industrial scale, through fraudulent access. If the figures hold up, tens of millions of dollars of R&D and years of safety-alignment work can be substantially replicated for the marginal cost of API calls and account-creation infrastructure. For any company whose moat depends on model capability rather than distribution, data, or workflow lock-in, that is a sobering data point.
Second, it exposes a structural asymmetry in the open-versus-closed model competition. Anthropic does not sell commercial Claude access in China, so by its own account, every one of the accused accounts was operating in violation of restrictions from the outset — meaning the normal deterrents against this kind of activity (contract enforcement, account termination, reputational cost) have limited reach across borders.² This is a genuinely difficult enforcement problem, and it is not obviously solvable by any individual company’s terms of service.
Third, it reframes the “value” of a closed frontier model. One financial commentator’s rebuttal of the panic is worth noting for the alternative reading it offers: the act of copying a system is itself, in that view, evidence the copy is behind — a lagging indicator of leadership, not proof that leadership has been erased.¹⁰ Executives should hold both readings simultaneously: the copying is real and consequential for near-term competitive dynamics, but it does not, by itself, establish that the copier has closed the gap in original research capability.
What the evidence does not show
Discipline matters here, because this story is easy to overstate in either direction.
- GLM (Zhipu/Z.ai) has not been named in any formal accusation by Anthropic, OpenAI, or any other party identified in current reporting. Zhipu’s difficulties in this period (a GPU shortage and a rocky GLM-5 rollout) are a separate story entirely.¹¹ Online speculation that Zhipu is quietly the largest distiller of all is commentary, not a documented finding, and should be treated accordingly.
- The specific claim that early Chinese models show “74.2% stylistic similarity to ChatGPT,” sometimes cited alongside this story, does not trace to any primary source, court filing, or reputable outlet that the research for this briefing could identify. It should not be repeated as fact.
- These remain single-sourced, contested allegations. Anthropic is a direct commercial competitor to the labs it has accused; Alibaba denies wrongdoing; and the underlying forensic evidence (rather than summary figures) has not been independently audited by a neutral third party. Detection was via behavioral fingerprinting and account-metadata correlation — strong circumstantial methods, but not equivalent to examining a competitor’s actual training data.
- Independent academic evidence exists, but it is correlational. A 2026 study measuring agentic behavior found that Kimi-K2 matched Claude’s tool-use patterns — such as proactively offering reassurance and taking redundant verification steps — more closely than some models within Anthropic’s own family match each other.¹² That is consistent with inherited training signal, but consistent-with is not proof-of.
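To make "matched more closely" concrete, one simple way to compare agent behavior is cosine similarity over behavior-frequency profiles. The sketch below is illustrative only. It is not the method of the cited study, the behavior labels are loosely borrowed from the examples above, and the three profiles (`model_a`, `model_b`, `model_c`) and their counts are invented placeholders, not measurements of any real model.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two sparse behavior-count profiles (dicts)."""
    keys = set(a) | set(b)
    dot = sum(a.get(k, 0) * b.get(k, 0) for k in keys)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty profile is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


# Hypothetical counts of each behavior per 100 agent episodes (placeholders).
model_a = {"offers_reassurance": 40, "redundant_verification": 30, "asks_clarification": 10}
model_b = {"offers_reassurance": 38, "redundant_verification": 33, "asks_clarification": 9}
model_c = {"offers_reassurance": 5, "redundant_verification": 10, "asks_clarification": 45}

if __name__ == "__main__":
    print("a vs b:", round(cosine_similarity(model_a, model_b), 3))
    print("a vs c:", round(cosine_similarity(model_a, model_c), 3))
```

A high score of this kind shows only that two behavior profiles point in the same direction. It says nothing about why they do, which is exactly the gap between consistent-with and proof-of.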
Implications for leaders
- Do not assume a capability lead is a durable moat. If your competitive strategy depends on being the best model or system for 18 months, build the plan assuming a well-resourced competitor can meaningfully close that gap in a fraction of the time it took you to open it — legally or otherwise.
- Distinguish the technique from the violation. Distillation itself is not the scandal; unauthorized, industrial-scale extraction through fraudulent access is. This distinction matters for how you think about your own use of third-party model outputs, your vendor contracts, and any public commentary your organization makes on this topic.
- Watch enforcement mechanisms, not just headlines. The more interesting long-term story may be the detection tooling — behavioral fingerprinting, traffic classifiers, metadata correlation — that Anthropic disclosed alongside the accusation.¹ Expect API terms, rate limits, and account-verification friction to tighten across the industry as a second-order effect, including for legitimate developers.
- Treat single-sourced claims, even from credible companies, with the same scrutiny you’d apply to a competitor’s press release. The core figures here were reported consistently by independent outlets, which confirms what Anthropic alleged but not the evidence behind it. “Well-reported allegation” and “adjudicated fact” remain different categories, and the gap between them is exactly where geopolitical narratives tend to get overstated.
References
1. Anthropic. “Detecting and Preventing Distillation Attacks.” Anthropic News, February 2026. https://www.anthropic.com/news/detecting-and-preventing-distillation-attacks
2. Bloomberg. “Anthropic Accuses DeepSeek, MiniMax, Moonshot of Illicit AI Model Distillation.” February 23, 2026. https://www.bloomberg.com/news/articles/2026-02-23/anthropic-says-deepseek-minimax-distilled-ai-models-for-gains
3. CNBC. “Anthropic joins OpenAI in flagging ‘industrial-scale’ distillation campaigns by Chinese AI firms.” February 24, 2026. https://www.cnbc.com/2026/02/24/anthropic-openai-china-firms-distillation-deepseek.html
4. South China Morning Post. “Anthropic’s distilling charges against Chinese firms expose AI training grey area.” February 24, 2026. https://www.scmp.com/tech/tech-war/article/3344499/anthropics-distilling-charges-against-chinese-firms-expose-ai-training-grey-area
5. Cybersecurity Magazine. “Inside Anthropic’s Claims of Distillation Attack by Alibaba.” June 2026. https://cybermagazine.com/news/inside-anthropics-claims-of-distillation-attack-by-alibaba
6. Tom’s Hardware. “Anthropic claims that China’s Alibaba used 25,000 fake accounts and 28.8 million exchanges to illicitly ‘distill’ its Claude model.” June 2026. https://www.tomshardware.com/tech-industry/artificial-intelligence/anthropic-claims-that-chinas-alibaba-illicitly-distilled-its-models-from-april-to-june-2026-says-effort-involved-25-000-fake-accounts-and-28-8-million-exchanges-on-claude
7. Forbes. “Anthropic Says Alibaba Used 25,000 Fake Accounts To Distill Claude.” June 26, 2026. https://www.forbes.com/sites/jonmarkman/2026/06/26/anthropic-says-alibaba-used-25000-fake-accounts-to-distill-claude/
8. Winston & Strawn LLP. “Is AI Distillation By DeepSeek IP Theft?” March 12, 2025. https://www.winston.com/en/insights-news/is-ai-distillation-by-deepseek-ip-theft
9. Rest of World. “OpenAI accuses DeepSeek of malpractice ahead of AI launch.” February 2026. https://restofworld.org/2026/openai-deepseek-distillation-dispute-us-china/
10. Forbes (Jon Markman commentary). Op. cit., ref. 7.
11. Yicai Global. “Chinese AI Startups MiniMax, DeepSeek, Moonshot Face Distillation Accusations, Peer Zhipu Hit by GPU Crisis.” February 24, 2026. https://www.yicaiglobal.com/news/chinese-ai-startups-minimax-deepseek-moonshot-face-distillation-accusations-peer-zhipu-hit-by-gpu-crisis
12. “When Agents Look the Same: Quantifying Distillation-Induced Similarity in Tool-Use Behaviors.” arXiv preprint, 2026. https://arxiv.org/pdf/2604.21255
Note on methodology: figures attributed to Anthropic throughout are the company’s own disclosed estimates, corroborated by independent reporting but not independently audited by a neutral third party. Alibaba has denied the allegations against it; DeepSeek, Moonshot AI, and MiniMax had not issued substantive public rebuttals at the time the underlying reporting was published.




