OpenAI accuses DeepSeek of model distillation in memo to Congress
OpenAI told US lawmakers that DeepSeek used obfuscated routers to scrape its models. Anthropic followed with its own accusations.
TL;DR: On February 12, 2026, OpenAI sent a memo to the U.S. House Select Committee on China accusing DeepSeek of systematically distilling its frontier models through obfuscated third-party routers and programmatic output extraction. Twelve days later, Anthropic followed with its own allegations that DeepSeek, Moonshot AI, and MiniMax used over 24,000 fake accounts to generate more than 16 million exchanges with Claude. No independent technical verification exists for either claim, and DeepSeek has not responded publicly.
On February 12, 2026, OpenAI released a memo to the U.S. House Select Committee on China accusing its Chinese competitor DeepSeek of "ongoing efforts to free-ride on the capabilities developed by OpenAI and other US frontier labs."
The memo was not subtle. OpenAI described what it characterized as a coordinated campaign by DeepSeek employees to extract outputs from OpenAI's models through methods designed to avoid detection. The company said it had observed "accounts associated with DeepSeek employees developing methods to circumvent OpenAI's access restrictions and access models through obfuscated third-party routers and other ways that mask their source."
The timing matters. OpenAI sent this memo while Congress was actively debating AI export control legislation. Multiple bills are moving through the 119th Congress, including the Decoupling America's Artificial Intelligence Capabilities from China Act (S.321), the No Advanced Chips for the CCP Act (H.R. 5022), and the China AI Power Report Act (H.R. 6275). The House Foreign Affairs Committee voted 42-2 to advance legislation restricting AI chip exports to China.
OpenAI was not just informing Congress of a security issue. It was lobbying for policy action at the exact moment that policy was being written.
"This is part of the CCP's playbook: steal, copy, and kill." -- Rep. John Moolenaar, Republican chair of the House China committee
Representative John Moolenaar's statement came in direct response to the memo. It set the political tone for the weeks that followed.
Before judging the severity of OpenAI's claims, you need to understand what distillation means in the AI context, because the word carries different weight depending on who is using it.
Model distillation is a well-established machine learning technique originally described by Geoffrey Hinton in 2015. The core idea is straightforward. You have a large, expensive model (the "teacher") and you want to create a smaller, cheaper model (the "student") that behaves similarly. You do this by training the student not on raw data, but on the teacher's outputs.
The technical process works like this. You feed inputs to the teacher model and collect its responses, including not just the final answer but the probability distributions across possible answers (called "soft targets" or "logits"). These soft targets contain richer information than simple right-or-wrong labels. They encode the teacher's confidence levels, its uncertainty, and the relationships it sees between concepts. The student model trains on these outputs, effectively learning to mimic the teacher's reasoning patterns.
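The mechanics above can be sketched in a few lines. The following is a minimal illustration in NumPy of the classic Hinton-style distillation loss, not any lab's actual training code: the student is penalized by the KL divergence between its softened output distribution and the teacher's. The logit values are made up for the example.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Convert raw logits into a probability distribution, softened by temperature."""
    z = logits / temperature
    z = z - z.max()  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between teacher and student softened distributions.

    A temperature above 1 flattens the teacher's distribution, exposing the
    relative probabilities it assigns to wrong answers -- the 'soft targets'
    that carry more information than a single hard label.
    """
    p = softmax(teacher_logits, temperature)  # teacher's soft targets
    q = softmax(student_logits, temperature)  # student's predictions
    return float(np.sum(p * np.log(p / q)))

# Hypothetical logits: the teacher is confident in class 0, sees class 1 as
# a near miss, and rules out class 2.
teacher = np.array([4.0, 2.5, -1.0])
aligned = np.array([4.1, 2.4, -0.9])     # student that mimics the teacher
misaligned = np.array([-1.0, 4.0, 2.0])  # student that disagrees

assert distillation_loss(teacher, aligned) < distillation_loss(teacher, misaligned)
```

In a real training loop this loss is computed per example and minimized by gradient descent, often blended with a standard cross-entropy term on ground-truth labels.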
This is not inherently malicious. OpenAI itself offers supervised fine-tuning through distillation as a product feature. Meta's Llama models have been distilled into smaller variants. Google has distilled large Gemini models into smaller ones. The technique is standard practice across the industry.
What makes the DeepSeek accusation different, according to OpenAI, is scale, intent, and method. OpenAI is alleging that DeepSeek employees systematically scraped its models by routing millions of API calls through obfuscated intermediaries, violating OpenAI's terms of service and, potentially, U.S. intellectual property law.
The distinction matters. Reading a textbook to learn is one thing. Photocopying the textbook and selling it under a different cover is another. Chris Lehane, OpenAI's Chief Global Affairs Officer, used a similar analogy, comparing OpenAI's own training methods to "reading a library book and learning from it," while characterizing DeepSeek's approach as "putting a new cover on a library book and selling it as your own."
That analogy is doing a lot of work. Whether it holds up legally and technically remains unproven.
The memo outlined specific technical behaviors OpenAI claims to have observed. These fall into three categories.
First, obfuscated routing. OpenAI said DeepSeek-associated accounts accessed its models through third-party intermediaries designed to hide the true origin of the requests. Instead of connecting directly to OpenAI's API, which would have exposed DeepSeek's identity, these accounts allegedly routed traffic through proxy services that masked the source.
Second, programmatic extraction. The memo stated that "DeepSeek employees developed code to access U.S. AI models and obtain outputs for distillation in programmatic ways." This means automated scripts were allegedly sending large volumes of carefully crafted prompts and capturing the responses at scale. Not casual browsing, but industrial-level data collection.
Third, cross-platform targeting. OpenAI wrote that it "believes that DeepSeek also uses third-party routers to access frontier models from other U.S. labs." This suggests the alleged distillation campaign was not limited to OpenAI's models alone.
What OpenAI did not provide is independent technical evidence. No network logs were attached to the public memo. No forensic analysis from a third party was cited. The claims rest entirely on OpenAI's internal investigation.
This matters because OpenAI has a financial incentive to frame DeepSeek's success as theft rather than legitimate innovation. If DeepSeek genuinely achieved GPT-4 level performance at a fraction of the cost through clever engineering, that is an existential threat to OpenAI's business model. If DeepSeek cheated to get there, that is a regulatory problem with a regulatory solution.
Twelve days after OpenAI's memo, Anthropic published its own accusations. On February 24, 2026, Anthropic released a report titled "Detecting and Preventing Distillation Attacks", naming three Chinese AI companies: DeepSeek, Moonshot AI, and MiniMax.
Anthropic's claims were more specific in their numbers. The company alleged that the three firms collectively used over 24,000 fraudulently created accounts to generate more than 16 million exchanges with Claude.
The breakdown by company was revealing: MiniMax allegedly accounted for more than 13 million exchanges, Moonshot AI for more than 3.4 million, and DeepSeek for roughly 150,000.
"We found concerted distillation attack campaigns from three organizations." -- Anthropic
The scale difference is notable. MiniMax's alleged 13 million exchanges dwarf DeepSeek's 150,000. Yet DeepSeek is the name that dominates headlines, largely because of the geopolitical stakes attached to its success and the fact that OpenAI had already primed the public narrative two weeks earlier.
Anthropic framed the issue in national security terms, telling CNBC that when capabilities are distilled without safeguards, the resulting models can be more easily misused. This echoed OpenAI's earlier argument almost word for word.
Neither DeepSeek nor MiniMax nor Moonshot AI has responded publicly to these allegations as of publication.
You cannot understand OpenAI's memo without understanding why DeepSeek's R1 model created panic across Silicon Valley when it launched in January 2025.
DeepSeek-R1 is a reasoning model that, according to multiple independent benchmarks, performs at roughly the same level as OpenAI's o1 model. It achieved 79.8% pass@1 on the American Invitational Mathematics Examination (AIME) and 97.3% pass@1 on MATH-500. It reached a 2,029 Elo rating on Codeforces-style programming challenges. These are serious numbers.
The cost gap was what broke people's brains. DeepSeek reported that the reinforcement learning training phase for R1 cost approximately $294,000 in GPU usage. The full pre-training run (on the underlying DeepSeek V3 model) used 2,048 H800 GPUs and cost an estimated $5.3 million. For context, OpenAI's GPT-4 training is widely estimated to have cost over $100 million.
On the inference side, DeepSeek's API pricing was $0.55 per million input tokens and $2.19 per million output tokens. Sam Altman himself acknowledged that DeepSeek runs 20 to 50 times cheaper than OpenAI's comparable model. VentureBeat reported this as a 95% cost reduction.
DeepSeek's model is also open-source, meaning anyone can download and run it. This is the opposite of OpenAI's approach.
The implications were immediate. If a Chinese lab could match American frontier models at a fraction of the cost, the billions of dollars in compute infrastructure that OpenAI, Anthropic, and Google were spending might not be the competitive moat they assumed. That is not just a technical story. It is a business story.
| Metric | DeepSeek R1 | OpenAI o1 |
|---|---|---|
| AIME pass@1 | 79.8% | ~79% |
| MATH-500 pass@1 | 97.3% | ~96% |
| Codeforces Elo | 2,029 | ~1,800 |
| API input cost (per 1M tokens) | $0.55 | ~$15.00 |
| API output cost (per 1M tokens) | $2.19 | ~$60.00 |
| Open source | Yes | No |
| Estimated training cost | ~$5.3M | $100M+ |
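The pricing gap in the table works out to roughly a 27x difference on both input and output tokens. A quick sketch makes the arithmetic concrete; the daily workload below is hypothetical, chosen only for illustration.

```python
# Published API rates, USD per 1M tokens (o1 rates are approximate).
deepseek_in, deepseek_out = 0.55, 2.19
openai_in, openai_out = 15.00, 60.00

input_ratio = openai_in / deepseek_in
output_ratio = openai_out / deepseek_out

# Hypothetical workload: 10M input and 2M output tokens per day.
tokens_in, tokens_out = 10, 2  # in millions
deepseek_cost = tokens_in * deepseek_in + tokens_out * deepseek_out
openai_cost = tokens_in * openai_in + tokens_out * openai_out

print(f"input ratio ~{input_ratio:.0f}x, output ratio ~{output_ratio:.0f}x")
print(f"daily cost: DeepSeek ${deepseek_cost:.2f} vs OpenAI ${openai_cost:.2f}")
```

At these rates a workload that costs under ten dollars a day on DeepSeek runs to hundreds on o1, which is the "95% cost reduction" framing VentureBeat used.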
OpenAI's memo did not focus exclusively on intellectual property. A significant portion dealt with safety implications.
The argument goes like this: when a model is distilled, the safety guardrails built into the original model are often lost in translation. The student model learns the teacher's capabilities but not its carefully calibrated refusals, content filters, or harm-reduction behaviors. The result, OpenAI claims, is a model that can do everything the original can do but without the restrictions.
OpenAI specifically cited "high-risk areas such as biology and chemistry" as domains where stripped-down distilled models could enable misuse. The memo warned that "when capabilities are copied through distillation, safeguards often fall to the wayside, enabling more widespread misuse."
This is not a hypothetical concern. Independent researchers have documented that many open-source models derived from frontier systems have weaker safety guardrails than their parent models. But it is also not unique to distillation. Any model can be fine-tuned to remove safety restrictions, regardless of how it was originally trained.
The safety argument serves a dual purpose. It reframes what could be seen as a business dispute into a national security issue. When a company says "they stole our IP," that is a commercial complaint. When a company says "they stole our IP and now people could use it to make bioweapons," that is a Congressional hearing.
"Since DeepSeek and many other Chinese models don't carry a monthly subscription cost, the prevalence of distillation could pose a business threat to American companies." -- Foundation for Defense of Democracies analysis
The legislative response has been swift and bipartisan.
The 119th Congress has multiple AI-related bills in various stages of advancement. The Decoupling America's Artificial Intelligence Capabilities from China Act (S.321) would prohibit the importation of AI technology developed in China. The No Advanced Chips for the CCP Act (H.R. 5022) requires Congressional approval for advanced AI chip exports to China. The China AI Power Report Act (H.R. 6275) demands an assessment of how effective current export controls actually are at restricting China's access to AI technology.
The GAIN AI Act would require companies to give U.S. businesses first priority in acquiring advanced AI chips before any exports. The STRIDE Act and the AI Overwatch Act add additional layers of oversight.
On the executive side, policy has shifted. In January 2026, the Bureau of Industry and Security revised its export licensing policy for advanced AI chips. Where there was previously a presumption of denial for exports to mainland China, the new rule allows case-by-case review under specific conditions. This is a loosening, not a tightening, and it happened at the same time OpenAI was warning Congress about Chinese AI theft.
The tension is real. The U.S. wants to maintain AI dominance. It also wants to sell chips. These goals sometimes conflict.
There are several questions that neither OpenAI's memo nor Anthropic's report addresses satisfactorily.
First, where is the evidence? Both companies have described what they observed internally, but neither has submitted forensic evidence to a court or an independent auditor. OpenAI's claims are assertions, not findings of fact. No judge has ruled. No independent technical review has been published. As multiple security researchers have noted, "the boundary between legitimate use and adversarial exploitation is often blurry."
Second, how is this different from what OpenAI does? OpenAI trained its models on massive quantities of publicly available text from the internet, much of which was created by individuals and companies who did not consent to their work being used that way. The New York Times is currently suing OpenAI over exactly this issue. When Chris Lehane draws a line between "reading a library book" and "putting a new cover on it," he is drawing a line that many content creators would argue OpenAI itself has crossed.
Third, is OpenAI's motivation informational or commercial? The memo was sent to a committee that has shown consistent hawkishness on China. The timing aligned with active legislative discussions about export controls. OpenAI stands to benefit directly from any policy that restricts Chinese competitors' access to American AI capabilities.
Fourth, does distillation even explain DeepSeek's success? DeepSeek's R1 model uses a Mixture of Experts (MoE) architecture and pure reinforcement learning. These are genuine architectural innovations. Even if some distillation occurred, it would not explain the cost efficiency gains that come from DeepSeek's engineering decisions around GPU utilization, mixed-precision training, and model architecture.
Fifth, what is DeepSeek's response? As of publication, DeepSeek has not publicly addressed either OpenAI's or Anthropic's accusations. The silence is notable, though it could reflect anything from guilt to indifference to a strategic decision to avoid amplifying the narrative.
"The rapid advance of AI means that distillation attacks are becoming a serious threat." -- The Register
Here is a direct comparison of what each company has alleged:
| Dimension | OpenAI (Feb 12) | Anthropic (Feb 24) |
|---|---|---|
| Target committee | House Select Committee on China | Public report |
| Companies accused | DeepSeek | DeepSeek, Moonshot AI, MiniMax |
| Method described | Obfuscated routers, programmatic extraction | 24,000+ fake accounts |
| Volume cited | Not specified | 16 million+ exchanges total |
| DeepSeek volume | Not specified | 150,000 exchanges |
| Largest alleged offender | DeepSeek (sole target) | MiniMax (13M+ exchanges) |
| Independent verification | None | None |
| Legal action filed | No | No |
| DeepSeek response | None | None |
| Safety claims | Yes, bio/chem risks | Yes, guardrail stripping |
The pattern is clear. Both companies made serious allegations. Neither provided evidence that could be independently verified. Neither filed legal action. And the accused party has said nothing.
The distillation debate is about more than one company's models. It raises fundamental questions about how the AI industry works.
If distillation from commercial APIs is widespread, it means that any company offering API access to a frontier model is simultaneously providing training data to competitors. This creates a paradox. API access is how AI companies make money. But API access is also how competitors can extract the knowledge embedded in those models.
OpenAI's proposed solution is more restrictive policy and export controls. But there is a technical tension here. Rate limiting, watermarking, and usage monitoring can detect some distillation attempts, but they cannot prevent a determined adversary from accessing outputs through distributed accounts and obfuscated routing. The arms race between extraction and detection is ongoing, and it is not clear that detection will win.
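To see why per-account defenses struggle, consider the simplest building block: a sliding-window rate limiter. This is an illustrative sketch, not how any frontier lab actually does abuse detection (real systems combine account linkage, prompt-pattern analysis, and payment signals, not volume alone).

```python
from collections import defaultdict, deque
import time

class SlidingWindowLimiter:
    """Per-account sliding-window rate limiter (illustrative only)."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests
        self.window = window_seconds
        self.history = defaultdict(deque)  # account_id -> request timestamps

    def allow(self, account_id, now=None):
        now = time.monotonic() if now is None else now
        q = self.history[account_id]
        while q and now - q[0] > self.window:
            q.popleft()  # evict timestamps that fell out of the window
        if len(q) >= self.max_requests:
            return False  # over quota: throttle or flag for review
        q.append(now)
        return True

limiter = SlidingWindowLimiter(max_requests=3, window_seconds=60)
results = [limiter.allow("acct-1", now=t) for t in (0, 1, 2, 3)]
# results == [True, True, True, False]: the fourth request in the window is rejected.
```

The weakness is exactly what both memos allege was exploited: splitting traffic across thousands of accounts keeps every individual account under any per-account threshold, which is why volume limits alone cannot stop a distributed extraction campaign.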
For developers and businesses using AI APIs, the immediate impact is minimal. But the longer-term implications could be significant. If U.S. policy restricts Chinese access to American AI models, that could fragment the global AI ecosystem. Companies operating across borders could face compliance complexity. Open-source AI development, which benefits from global collaboration, could be caught in the crossfire.
The fundamental question is whether AI model outputs should be treated as intellectual property in the same way that software code or creative works are. Current law is unclear on this point. OpenAI's memo to Congress is, among other things, an attempt to shape that legal framework before courts or regulators settle the question themselves.
**What is model distillation?**

Model distillation is training a smaller, cheaper AI model to mimic a larger, more expensive one by feeding it the larger model's outputs. Think of it as a student learning by studying a teacher's answers to thousands of test questions, rather than studying the original textbooks. The technique was formalized by Geoffrey Hinton in 2015 and is widely used across the AI industry.
**Did OpenAI prove that DeepSeek distilled its models?**

No. OpenAI described observations from its internal monitoring in a memo to Congress. It stated that accounts linked to DeepSeek employees used obfuscated routers to access its models. But no independent forensic evidence has been published, no court has ruled on the claim, and no third-party audit has been conducted. These are allegations, not findings.
**Has DeepSeek responded to the accusations?**

As of February 27, 2026, DeepSeek has not issued any public response to either OpenAI's or Anthropic's allegations. Neither Moonshot AI nor MiniMax has responded to Anthropic's claims either.
**What exactly did Anthropic accuse the three companies of?**

Anthropic alleged that DeepSeek generated more than 150,000 exchanges with Claude through fraudulently created accounts. However, Anthropic's report focused more heavily on MiniMax (over 13 million exchanges) and Moonshot AI (over 3.4 million exchanges). In total, Anthropic claimed over 16 million exchanges from approximately 24,000 fake accounts across all three Chinese firms.
**Is model distillation illegal?**

The legal status is ambiguous. OpenAI's terms of service prohibit using outputs to train competing models, which would make distillation a breach of contract. Whether it constitutes intellectual property theft under U.S. law is untested in court. Congress is considering legislation that could clarify this, but no specific law currently makes model distillation a criminal offense.
**How much did DeepSeek's models cost compared to OpenAI's?**

DeepSeek reported approximately $294,000 for the reinforcement learning phase and an estimated $5.3 million for the full pre-training run. GPT-4 is widely estimated to have cost over $100 million. DeepSeek's API pricing is roughly 27 times cheaper than OpenAI's comparable model at $0.55 per million input tokens versus approximately $15 per million.
**Why did OpenAI send its memo to the House Select Committee on China?**

The committee is the primary Congressional body focused on U.S.-China competition, including technology and AI. OpenAI framed the distillation issue as a national security concern, not just a commercial dispute, which made the China committee the natural audience. The timing coincided with active legislative debates about AI export controls.
**Could DeepSeek have achieved its results without distillation?**

Yes, it is technically plausible. DeepSeek R1 uses a Mixture of Experts architecture and pure reinforcement learning, both of which are genuine innovations that contribute to its cost efficiency. Multiple AI researchers have noted that the architectural choices alone could explain much of DeepSeek's performance. Distillation and independent innovation are not mutually exclusive.
**What legislation is Congress considering in response?**

The 119th Congress has multiple relevant bills: the Decoupling America's Artificial Intelligence Capabilities from China Act (S.321), the No Advanced Chips for the CCP Act (H.R. 5022), the China AI Power Report Act (H.R. 6275), the GAIN AI Act, the STRIDE Act, and the AI Overwatch Act. These range from import bans on Chinese AI to enhanced export controls on chips.
**Will this affect developers building on AI APIs?**

Not directly in the short term. But if distillation concerns lead to more restrictive API policies, such as stricter rate limiting, enhanced monitoring, or reduced output detail, developers could see changes in how they interact with frontier models. Companies building on these APIs should monitor both the policy environment and any updates to terms of service.