Following OpenAI, another AI company, Anthropic, has now fired its own opening shot. In an official statement released earlier, the company alleges that three Chinese AI startups, including DeepSeek, are illegally extracting Claude's conversational data through large-scale "distillation attacks" to boost the capabilities of their own competing models.

As competition in large language model (LLM) technology intensifies, high-quality training data has become companies' most valuable asset. According to reports, Anthropic, the developer of the AI chatbot Claude, issued a strongly worded statement on its website, specifically naming the Chinese AI companies DeepSeek, Moonshot, and MiniMax and accusing them of mounting an "industrial-scale operation" to illegally siphon off Claude's capabilities.
"Industrial-scale" plagiarism: roughly 24,000 fake accounts and 16 million conversations.
In the AI industry, "model distillation" is not a new term. It typically refers to training and improving a smaller, weaker model on the outputs of a stronger model (such as GPT-4 or Claude). While distillation is a legitimate optimization technique under certain licenses, Anthropic emphasizes that these companies' actions crossed the line into a malicious attack.
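The core idea of distillation can be illustrated with a toy sketch: a "student" model is fit to a "teacher's" answers rather than to any original ground truth. Everything below is illustrative; real LLM distillation queries a large model's API and fine-tunes a smaller neural network on the responses.

```python
# Toy sketch of model distillation: the student never sees real labels,
# only what the teacher says. The teacher here is a stand-in function;
# in practice it would be a strong LLM queried over an API.

def teacher(x: float) -> float:
    # Stand-in for a powerful model: some target behavior to imitate.
    return 2.0 * x + 1.0

# Step 1: query the teacher to build a synthetic training set.
inputs = [i / 10 for i in range(100)]
dataset = [(x, teacher(x)) for x in inputs]

# Step 2: fit a simple student (least-squares line) to the teacher's
# answers. The student recovers the teacher's behavior cheaply.
n = len(dataset)
sx = sum(x for x, _ in dataset)
sy = sum(y for _, y in dataset)
sxx = sum(x * x for x, _ in dataset)
sxy = sum(x * y for x, y in dataset)
slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
intercept = (sy - slope * sx) / n

print(round(slope, 2), round(intercept, 2))  # student recovers ~2.0, ~1.0
```

The asymmetry the dispute hinges on is visible even here: the teacher's capability is expensive to build, but once it can be queried freely, reproducing it costs only the price of the queries.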
Anthropic says these three Chinese AI companies used a combined total of approximately 24,000 fraudulent accounts to conduct over 16 million intensive conversations with Claude. Anthropic believes these competitors are using Claude as a "shortcut" in their research and development, not only to quickly build more advanced AI models, but also potentially to bypass the safety guardrails set by the original developer.
Is the evidence overwhelming? Anthropic vows to upgrade its defenses.
So how did Anthropic catch these "moles"?
The official statement said that by correlating IP addresses, comparing request metadata and infrastructure characteristics, and cross-referencing with other AI industry peers who observed similar anomalous behavior, the company can attribute these distillation attacks to the three specific Chinese companies named above with "high confidence."
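One of the signals described, correlating accounts by shared infrastructure, can be sketched as follows. This is a hypothetical illustration, not Anthropic's actual method: the field names, the /24-subnet grouping, and the thresholds are all assumptions.

```python
# Hypothetical abuse-detection sketch: flag clusters of accounts that
# share a network prefix and jointly generate heavy request volume.
# Real systems would use many more signals (metadata, timing, content).

from collections import defaultdict

def flag_suspicious(requests, min_accounts=5, min_requests=1000):
    """requests: iterable of (account_id, ip, request_count) tuples."""
    by_prefix = defaultdict(lambda: {"accounts": set(), "total": 0})
    for account, ip, count in requests:
        prefix = ".".join(ip.split(".")[:3])  # group by /24 subnet
        by_prefix[prefix]["accounts"].add(account)
        by_prefix[prefix]["total"] += count
    # Keep only prefixes with many accounts AND heavy combined traffic.
    return {
        prefix: info
        for prefix, info in by_prefix.items()
        if len(info["accounts"]) >= min_accounts
        and info["total"] >= min_requests
    }

# Eight coordinated accounts on one subnet vs. one ordinary user.
logs = [(f"acct{i}", f"10.0.1.{i}", 300) for i in range(8)]
logs += [("normal_user", "192.168.5.7", 40)]
flags = flag_suspicious(logs)
print(list(flags))  # → ['10.0.1']
```

The design choice worth noting is that no single account looks abnormal here; it is only the aggregate view across infrastructure that reveals the cluster, which matches Anthropic's description of correlating accounts rather than judging them individually.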
In fact, this is not the first case in the industry. As early as the beginning of last year, OpenAI made similar accusations, claiming that competitors were using distillation technology to replicate its model capabilities and subsequently blocked a large number of suspicious accounts. In response, Anthropic promised to fully upgrade its system's defense mechanisms, making future distillation attacks more difficult to execute and easier for the system to detect.
However, the incident carries a touch of irony: even as Anthropic loudly accuses others of "stealing data," it is itself facing infringement litigation from multiple music publishers, who accuse it of illegally using copyrighted song lyrics to train Claude.
Analysis
This "distillation war" has revealed the most frustrating yet realistic aspect of the current AI industry's development: high-quality training data is running out.
For Chinese AI companies like DeepSeek and Moonshot, which started later or are constrained by US export restrictions on high-end compute, training top-tier models from scratch on clean, curated internet data is too slow and too computationally expensive. The fastest route? Directly "ask" the world's smartest AIs (such as Claude or ChatGPT) and feed the well-organized, logically rigorous "golden answers" to their own models: this is what is known as "distillation."
Anthropic's anger is entirely understandable. After all, the fruits of an effort costing hundreds of millions of dollars in compute were "siphoned off" for little more than the API fees of tens of thousands of accounts.
However, this also reflects a "snake eating its own tail" dynamic in the current AI market: tech giants scrape the entire internet's copyrighted work without authorization to train their foundation models, while startups then scrape the giants' models without authorization to train their own smaller models. Until genuinely global rules on AI training data and copyright are established, this "you copy me, I copy you" battle will likely only intensify.


