Anthropic Accuses 3 Chinese Companies Of Mass AI Data Harvesting, Warns 'Window To Act' Narrow

Anthropic accuses three Chinese companies of illicitly harvesting data from its Claude AI using fraudulent accounts and distillation techniques.

Quick Read
  • Anthropic accused three Chinese firms of harvesting data from its Claude AI models via fake accounts
  • About 24,000 fraudulent accounts generated over 16 million conversations to train rival chatbots
  • The campaigns targeted Claude's key capabilities like reasoning, tool use, and coding functions

Anthropic has accused three Chinese companies of harvesting large amounts of data from its Claude artificial intelligence (AI) models through large-scale 'distillation' campaigns. In a blog post on Monday (Feb 23), the San Francisco-based company said DeepSeek, Moonshot and MiniMax used approximately 24,000 fraudulent accounts to generate over 16 million conversations with its Claude chatbot, output that could then be used to train their own rival chatbots.

Describing the activity as a breach of its terms of service and regional access restrictions, Anthropic said it detected synchronised usage patterns across multiple accounts, including shared payment methods and coordinated timings.

"The volume, structure, and focus of the prompts were distinct from normal usage patterns, reflecting deliberate capability extraction rather than legitimate use," Anthropic stated, adding: "Each campaign targeted Claude's most differentiated capabilities: agentic reasoning, tool use, and coding."

DeepSeek was involved in over 150,000 exchanges, primarily focusing on reinforcement learning and complex reasoning. Meanwhile, Moonshot and MiniMax saw significantly higher volumes, recording 3.4 million and 13 million exchanges respectively, as they focused on coding, data analysis, and tool orchestration.

One notable example involved a DeepSeek model prompting Claude to reveal its internal logic. By forcing Claude to articulate its reasoning step-by-step, the Chinese chatbot effectively extracted chain-of-thought training data at scale.

"We also observed tasks in which Claude was used to generate censorship-safe alternatives to politically sensitive queries like questions about dissidents, party leaders, or authoritarianism, likely in order to train DeepSeek's own models to steer conversations away from censored topics," it added.


What Is Distillation?

The Chinese AI models used a technique called 'distillation' to harvest the data. The process involves training a less capable model on the outputs of a stronger one. While distillation is a widely used, legitimate training method, Anthropic said it can also be used for illicit purposes.


Competitors can use it to acquire powerful capabilities from other labs in a fraction of the time and at a fraction of the cost that it would take to develop them independently.
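In its classic form, distillation trains a student model to match a teacher model's output distribution, typically by minimising the KL divergence between the two after softening both with a temperature. The campaigns described in this article work on generated text rather than raw model outputs, but the underlying idea can be sketched as follows (a minimal illustration, not any company's actual training code; the function names here are invented for the example):

```python
import numpy as np

def softmax(logits, T=1.0):
    # Temperature-scaled softmax; a higher T produces a softer,
    # more informative probability distribution.
    z = np.asarray(logits, dtype=float) / T
    z -= z.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(teacher_logits, student_logits, T=2.0):
    # KL divergence between the teacher's softened distribution
    # and the student's -- the standard distillation objective.
    p = softmax(teacher_logits, T)   # teacher's "soft labels"
    q = softmax(student_logits, T)   # student's predictions
    return float(np.sum(p * (np.log(p) - np.log(q))))

# A student that reproduces the teacher's logits incurs zero loss;
# a student with mismatched logits incurs a positive loss that
# gradient descent would then drive down.
teacher = [2.0, 1.0, 0.1]
aligned_loss = distillation_loss(teacher, [2.0, 1.0, 0.1])   # ~0.0
mismatch_loss = distillation_loss(teacher, [0.1, 1.0, 2.0])  # > 0
```

Minimising this loss over millions of teacher outputs is what lets a weaker model inherit a stronger model's behaviour cheaply, which is why, legitimate uses aside, unauthorised distillation at the scale Anthropic describes is commercially significant.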

'Window To Act Narrow'

Anthropic warned that such illicit campaigns were growing in "intensity and sophistication" and that the window to act was narrowing.

"The window to act is narrow, and the threat extends beyond any single company or region. Addressing it will require rapid, coordinated action among industry players, policymakers, and the global AI community."

Anthropic is not the only leading AI developer to highlight the threat posed by Chinese entities scraping US proprietary technology. In a memo to the House Select Committee on China last week, OpenAI alleged that Chinese firms have extracted extensive data from ChatGPT to bolster their own models.
