DeepSeek V4 locks out Nvidia and AMD, hands Huawei a multi-week head start
DeepSeek denied US chipmakers early access to its V4 model while giving Huawei priority optimization time. Export control allegations add fuel.
TL;DR: DeepSeek has denied Nvidia and AMD pre-release access to its upcoming V4 flagship model, giving Huawei and other Chinese chipmakers a multi-week optimization head start. A senior Trump administration official alleges the model was trained on Nvidia Blackwell chips inside mainland China, potentially violating US export controls. DeepSeek may attempt to scrub that evidence and claim the model ran on Huawei hardware instead.
On February 25, 2026, Reuters reported that DeepSeek has withheld pre-release access to its upcoming V4 flagship model from Nvidia, AMD, and other US chipmakers. Instead, the company gave Huawei and other Chinese chip manufacturers a multi-week head start to optimize their hardware for the new model.
The V4 model itself is a significant step up from DeepSeek's previous work. Early details point to a 1-trillion-parameter Mixture-of-Experts (MoE) architecture with a 1-million-token context window. It introduces three architectural innovations: Manifold-Constrained Hyper-Connections (mHC), Engram conditional memory, and DeepSeek Sparse Attention. Internal benchmarks reportedly claim 80%+ on SWE-bench at 10x to 40x lower inference cost than Western competitors.
The model can reportedly run on dual RTX 4090 GPUs, and DeepSeek plans to open-source the weights under Apache 2.0 licensing, maintaining the same open approach that made its earlier models so disruptive.
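The MoE design is what makes numbers like "1 trillion parameters on consumer GPUs" even conceivable: a router activates only a handful of expert blocks per token, so most parameters sit idle on any given forward pass. A minimal top-k routing sketch (all shapes, sizes, and weights here are illustrative toy values, not V4's actual design):

```python
import numpy as np

def moe_layer(x, experts, gate_w, top_k=2):
    """Minimal top-k Mixture-of-Experts routing.

    x: (tokens, d) activations; experts: list of (d, d) matrices standing in
    for full FFN expert blocks; gate_w: (d, n_experts) router weights.
    Only the top_k experts the router selects actually run for each token.
    """
    logits = x @ gate_w                                # (tokens, n_experts)
    top = np.argsort(logits, axis=-1)[:, -top_k:]      # top-k expert indices
    sel = np.take_along_axis(logits, top, axis=-1)
    w = np.exp(sel - sel.max(axis=-1, keepdims=True))  # softmax over the
    w /= w.sum(axis=-1, keepdims=True)                 # selected experts only
    out = np.zeros_like(x)
    for t in range(x.shape[0]):
        for k in range(top_k):
            out[t] += w[t, k] * (x[t] @ experts[top[t, k]])
    return out, top

rng = np.random.default_rng(0)
d, n_experts, tokens = 16, 8, 4
experts = [rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(n_experts)]
out, chosen = moe_layer(rng.standard_normal((tokens, d)), experts,
                        rng.standard_normal((d, n_experts)))
print(out.shape)        # (4, 16): same shape as the input activations
print(chosen.shape[1])  # 2 of 8 experts ran per token
```

With 2 of 8 experts active, only a quarter of the expert parameters do work per token; DeepSeek-V3's real ratio was sparser still, activating 37B of 671B parameters (about 5.5%).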
But the story here is not the model's architecture. It is who gets to optimize for it first.
In the AI industry, there is an unwritten but universally followed practice. When a major lab prepares to release a new model, it shares pre-release versions with all major chipmakers. Nvidia, AMD, Intel, Qualcomm, whoever makes hardware that developers might use. The reason is practical. Chipmakers need time to tune their drivers, compilers, and software stacks so the model runs efficiently on their hardware from day one.
DeepSeek itself used to do exactly this. The company previously worked closely with Nvidia's technical teams to ensure its models performed well on Nvidia GPUs. That relationship is now severed, at least for V4.
"The move is likely part of a broader strategy by the Chinese government to try to keep US hardware and models disadvantaged." -- Reuters
By giving Huawei a weeks-long head start, DeepSeek ensures that when V4 launches publicly, it will run best on Chinese hardware. Nvidia and AMD engineers will be starting from scratch, reverse-engineering optimizations that Huawei has already locked in. For any organization choosing between Huawei Ascend chips and Nvidia GPUs to run DeepSeek V4, the performance gap at launch could tip the decision.
This is not just a technical optimization question. It is a market access strategy wrapped in a model release.
Here is where the story takes a much sharper turn. A senior Trump administration official told Reuters that DeepSeek's V4 model was trained on Nvidia's Blackwell chips, specifically on a cluster located inside mainland China.
If true, this would be a direct violation of US export controls. Nvidia's Blackwell architecture (the B200 and related products) is explicitly restricted from export to China under rules that have been tightened repeatedly since October 2022. No one disputes this. The chips are on the controlled list. Selling them to Chinese entities, or allowing them to be used for training inside China, violates current US law.
The allegation is not vague. It specifically names the chip architecture (Blackwell), the location (mainland China), and the purpose (training). That level of specificity, even from an unnamed "senior" official, signals that the administration believes it has evidence.
Nvidia has not been accused of selling the chips directly. The implication is that DeepSeek obtained Blackwell hardware through intermediaries, diversion networks, or pre-existing stockpiles that were smuggled before enforcement tightened. This matches a pattern that US officials have been warning about for over two years. Despite increasingly strict controls, advanced Nvidia chips keep showing up in China through shell companies, resellers in Southeast Asia and the Middle East, and other gray-market channels.
"DeepSeek trained latest AI model on Nvidia Blackwell chips despite US ban." -- Investing.com
The same US official made a second, more provocative claim. DeepSeek may attempt to scrub technical indicators from the V4 model that would reveal its reliance on American AI chips. The plan, reportedly, is to then publicly claim the model was trained on Huawei hardware instead.
This is technically plausible. AI models do contain artifacts from their training environment. Things like numerical precision patterns, memory access optimizations, and specific kernel signatures that experienced engineers can use to identify the underlying hardware. However, these traces can also be obfuscated with enough effort.
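As a toy illustration of one such artifact: bfloat16, a format common in recent training stacks, keeps only the top 16 bits of a float32, so weight values that passed through bf16 storage round-trip exactly while fp32-native values almost never do. This single check says nothing about the chip vendor on its own (bf16 runs on many platforms, Ascend included), and real provenance forensics is far broader; the sketch below, with hypothetical simulated weights, just shows the kind of numerical fingerprint analysts can look for:

```python
import numpy as np

def bf16_representable_fraction(weights):
    """Fraction of float32 values whose low 16 mantissa bits are zero,
    i.e. values exactly representable in bfloat16. A tensor saved from a
    bf16 run round-trips exactly; fp32-trained weights essentially never do.
    """
    bits = weights.astype(np.float32).view(np.uint32)
    return float(np.mean((bits & 0xFFFF) == 0))

rng = np.random.default_rng(1)
fp32_w = rng.standard_normal(10_000).astype(np.float32)
# Simulate weights that passed through bf16 storage by zeroing the low
# 16 bits (bf16 -> fp32 widening pads those bits with zeros).
bf16_w = (fp32_w.view(np.uint32) & np.uint32(0xFFFF0000)).view(np.float32)

print(bf16_representable_fraction(fp32_w))  # near 0: random fp32 values
print(bf16_representable_fraction(bf16_w))  # 1.0: every value round-trips
```

Scrubbing such traces means, roughly, re-quantizing or re-encoding weights so that the stored values no longer betray their original precision path, which is exactly why the claim that it "can be obfuscated with enough effort" is plausible.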
The timing of the Huawei head start now looks different in this light. If DeepSeek's V4 launches with optimized Huawei performance and scrubbed Nvidia artifacts, the company can plausibly argue it was Huawei-native from the beginning. The multi-week optimization window gives Huawei engineers time to fine-tune their Ascend stack so the model runs convincingly well on domestic hardware.
Is this what is actually happening? We do not know for certain. What we do know is that a senior US government official believes this is the plan and has told Reuters as much. That alone is significant, regardless of whether the allegation can be proven.
To understand why V4 matters so much, you need to understand what DeepSeek has already accomplished.
DeepSeek-V3, released in late 2024, was a 671-billion parameter MoE model with 37 billion parameters activated per token. It used Multi-head Latent Attention (MLA) and was trained on 14.8 trillion tokens using 2.788 million H800 GPU-hours. The reported training cost was roughly $5.5 million. That number stunned the industry. At a time when frontier model training runs were costing $100 million or more, DeepSeek achieved competitive performance for a fraction of the price.
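The headline cost figure is simple arithmetic over DeepSeek's published numbers; the $2-per-GPU-hour rate is the rental-price assumption from DeepSeek's own V3 technical report, not a market quote, and it excludes research, data, and prior experiments:

```python
# DeepSeek's published V3 training figures: 2.788 million H800 GPU-hours,
# priced at the report's assumed $2 per GPU-hour rental rate.
gpu_hours = 2.788e6
usd_per_gpu_hour = 2.0
training_cost = gpu_hours * usd_per_gpu_hour
print(f"${training_cost / 1e6:.3f}M")  # $5.576M, the widely cited ~$5.5M figure
```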
DeepSeek-R1, the reasoning model released in January 2025, caused an even bigger shock. On January 27, 2025, Nvidia's stock dropped 17% in a single day, erasing nearly $589 billion in market value. It was the largest single-day market cap loss in stock market history, more than doubling Meta's previous record of $240 billion. The tech-heavy Nasdaq fell 3.1%. The S&P 500 dropped 1.5%.
The panic was not about one model being slightly better than another. It was about the implication that Chinese labs could build frontier-class AI at dramatically lower cost, potentially undermining the entire investment thesis behind hundreds of billions in Western AI infrastructure spending.
DeepSeek's models have been downloaded more than 75 million times on Hugging Face since the company rose to global prominence in January 2025. That usage rate matters. When a model gets that much adoption, the hardware it runs best on gains a significant market advantage.
| Model | Parameters | Training cost | Key impact |
|---|---|---|---|
| DeepSeek-V3 | 671B (37B active) | ~$5.5M | Proved frontier AI possible at low cost |
| DeepSeek-R1 | Not disclosed | Not disclosed | Triggered $589B Nvidia selloff |
| DeepSeek-V4 | ~1T (MoE) | Not yet disclosed | Locks out US chipmakers from optimization |
The other half of this story is Huawei's readiness to capitalize on the opportunity.
Huawei's Ascend 910C is its current flagship AI accelerator. It combines two 910B dies on a single board, delivering up to 800 TFLOPS in FP16 mode with 128GB of HBM3 memory and 3.2 TB/s of bandwidth. It is manufactured using SMIC's 7nm (N+2) process with DUV lithography, not the more advanced EUV that TSMC uses for Nvidia's chips.
The chip has real limitations. According to Mizuho Securities, the estimated yield rate for the Ascend 910C is only around 30%, which severely constrains production volume. DeepSeek's own research suggests the 910C delivers approximately 60% of Nvidia H100 inference performance. That is a meaningful gap.
But "60% of an H100" is not zero. For many workloads, especially inference at scale, that performance level is workable. And Huawei targeted 700,000 Ascend chip shipments in 2025, with the 910D already in development.
Here is the strategic calculation. If DeepSeek optimizes V4 specifically for Ascend hardware, that 60% raw performance gap narrows significantly for this particular model. Software optimization can close hardware gaps, sometimes dramatically. By the time Nvidia engineers catch up on V4 optimization, Huawei's ecosystem will already have production deployments running.
"By prioritizing Huawei's Ascend chips for optimization, DeepSeek is accelerating the development of a parallel software ecosystem that reduces long-term dependency on US technology." -- The China Academy
This is not just about one model. It is about building a self-sustaining Chinese AI hardware and software stack that does not need American components at all.
The DeepSeek V4 situation sits within a broader conflict that has been escalating for over three years. Here is the key timeline.
October 2022. The Commerce Department's Bureau of Industry and Security (BIS) rolled out the first major export controls on advanced AI chips. Nvidia's A100 and H100 were restricted. The goal was to prevent China's military and surveillance agencies from accessing cutting-edge processing power.
Late 2022 to mid 2023. Nvidia designed custom chips for the Chinese market. The A800 and H800 reduced NVLink bandwidth from 600 GB/s to 400 GB/s to comply with the rules. Chinese companies bought them in volume.
October 2023. BIS updated and expanded the controls, closing the workaround. The rules now covered a broader set of chips and semiconductor manufacturing equipment. Nvidia's China-specific products were effectively blocked.
December 2024. BIS added 140 companies to the Entity List and expanded the Foreign Direct Product Rule (FDPR). High-bandwidth memory restrictions were introduced for the first time.
January 2025. The outgoing Biden administration issued the "AI Diffusion Rule," establishing global performance thresholds that blocked sales of flagship GPUs like the H100 and H200 to China.
April 2025. The Trump administration banned even chips deemed "compliant" under previous rules, including Nvidia's H20, as part of a broader tariff escalation.
July 2025. The Trump administration reversed course on the H20, announcing that export licenses would be approved for the H20 and AMD's MI308.
February 2026. Reports emerge that DeepSeek trained V4 on Blackwell chips inside China despite all of the above restrictions.
The pattern is clear. Every time the US tightens controls, China finds workarounds. Every time China demonstrates progress, the US tightens further. The cycle has not stopped advanced Chinese AI development. It has changed where and how the development happens.
Nvidia reported $215.9 billion in fiscal year 2026 revenue, with data center sales accounting for $193.7 billion of that. The company guided $78 billion for Q1 FY27. By any measure, Nvidia is printing money.
But the DeepSeek V4 situation introduces a risk that financial analysts are only beginning to price in. If Chinese AI labs increasingly optimize for domestic hardware first and American hardware second (or not at all), the addressable market for Nvidia in China shrinks over time. China was once Nvidia's second-largest market. It is already much smaller due to export controls. This trend could accelerate.
The market impact of the V4 lockout specifically is difficult to quantify. Nvidia's stock has been volatile around DeepSeek news before. The R1 launch caused a $589 billion single-day wipeout. But the company recovered and went on to post record earnings.
The longer-term concern is different. It is about ecosystem lock-in. If the most popular open-source AI models are optimized for Huawei hardware first, developers in non-Western markets (Southeast Asia, the Middle East, Africa, and parts of Latin America) may increasingly choose Chinese chips over American ones. Not because the hardware is better, but because the software runs better on it.
That is a strategic problem that no amount of quarterly earnings can address.
DeepSeek's decision signals something larger than one company choosing sides in a chip war. It signals that AI model distribution is becoming a geopolitical tool.
When OpenAI releases GPT-5 or Anthropic ships Claude 4, those models are optimized for Nvidia hardware by default. Not because of any government mandate, but because that is what Western AI labs use. The optimization happens naturally through the development process.
DeepSeek is doing the same thing in reverse, but with an explicit strategic purpose. By optimizing V4 for Huawei Ascend first, DeepSeek is creating a pull factor toward Chinese hardware. Every developer who downloads V4 and runs it on Ascend chips contributes to a growing ecosystem of Chinese AI infrastructure.
This matters because DeepSeek's models are open-source and massively popular. With 75 million downloads on Hugging Face, DeepSeek has a distribution footprint that rivals many Western AI companies. If V4 continues that trajectory, the hardware it favors becomes the default for a significant portion of global AI development.
The semiconductor industry is projected to approach $1 trillion in total addressable market by 2026, with AI chips and high-bandwidth memory driving nearly half of industry revenue. The question of which chips run the most popular models is a question worth hundreds of billions of dollars.
We are watching, in real time, the formation of two parallel AI ecosystems. One built on American hardware and Western models. One built on Chinese hardware and Chinese models. The V4 lockout is not the beginning of this split, but it may be the moment it became irreversible.
DeepSeek denied Nvidia, AMD, and other US chipmakers pre-release access to its upcoming V4 flagship model. Instead, it gave Huawei and other Chinese chip manufacturers a multi-week head start to optimize their processors for V4. This is a break from standard industry practice where all major chipmakers receive early access simultaneously.
As of late February 2026, V4 has not been publicly released. Pre-release versions have been shared with Chinese chipmakers for optimization, but the general public release date has not been confirmed. Early reports suggest a 1-trillion parameter MoE architecture with 1-million-token context windows.
A senior Trump administration official told Reuters that DeepSeek trained V4 on Nvidia Blackwell chips inside mainland China. This has not been independently verified. If true, it would constitute a violation of US export controls. DeepSeek has not publicly confirmed or denied the claim.
Nvidia's Blackwell architecture is on the US export control list and cannot be legally sold to or used in China for AI training. If the training allegation is proven, it could trigger enforcement actions against intermediaries who supplied the chips and potentially against DeepSeek itself, though extraterritorial enforcement against Chinese entities is notoriously difficult.
The Ascend 910C delivers approximately 800 TFLOPS in FP16 with 128GB HBM3 and 3.2 TB/s bandwidth. DeepSeek's own research suggests it achieves about 60% of Nvidia H100 inference performance. However, it has only a 30% manufacturing yield rate at SMIC's 7nm process, which limits supply. It is competitive for certain workloads but still trails Nvidia's current-generation hardware in raw performance.
Chipmakers use pre-release models to optimize drivers, compilers, memory management, and execution pipelines for specific model architectures. Without that access, their hardware runs the model with generic settings rather than tuned ones. The performance difference can be 20-40% or more, which directly impacts customer purchasing decisions.
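The kind of tuning involved can be sketched with a toy autotuner: the best blocking (tile) size for a matrix multiply depends on the hardware's caches, so vendors benchmark candidate configurations against a model's actual layer shapes and ship the winners. Real driver and compiler tuning covers kernels, memory layouts, and scheduling, not just tiling; this is a deliberately simplified analogy:

```python
import time
import numpy as np

def tiled_matmul(A, B, tile):
    """Blocked matrix multiply. Which tile size runs fastest depends on the
    hardware's memory hierarchy -- the kind of knob vendor tuning searches."""
    n = A.shape[0]
    C = np.zeros((n, n), dtype=A.dtype)
    for i in range(0, n, tile):
        for k in range(0, n, tile):
            for j in range(0, n, tile):
                C[i:i+tile, j:j+tile] += A[i:i+tile, k:k+tile] @ B[k:k+tile, j:j+tile]
    return C

def autotune(A, B, candidate_tiles):
    """Benchmark each candidate tile size on the given shapes; keep the fastest."""
    timings = {}
    for tile in candidate_tiles:
        start = time.perf_counter()
        tiled_matmul(A, B, tile)
        timings[tile] = time.perf_counter() - start
    return min(timings, key=timings.get)

rng = np.random.default_rng(2)
n = 256
A, B = rng.standard_normal((n, n)), rng.standard_normal((n, n))
best = autotune(A, B, candidate_tiles=[16, 64, 256])
# Correctness is independent of which tile size the tuner picks:
print(np.allclose(tiled_matmul(A, B, best), A @ B))  # True
```

Without pre-release access, a vendor ships the generic equivalent of an untuned tile size and pays the performance penalty at launch, which is the 20-40% gap described above.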
It is technically possible but difficult. AI models carry artifacts from their training environment, including numerical precision patterns and kernel-level signatures. Expert forensic analysis can identify these traces, but they can also be obfuscated. The claim comes from a US government official, suggesting intelligence agencies may have their own methods for detecting hardware provenance.
DeepSeek-R1's January 2025 launch triggered a 17% drop in Nvidia's stock, erasing $589 billion in market value in a single day. The Nasdaq fell 3.1% and the S&P 500 dropped 1.5%. DeepSeek-V3 demonstrated that frontier-class AI could be trained for approximately $5.5 million, a fraction of what Western labs were spending.
China is building a parallel AI ecosystem that reduces dependency on American technology. This includes Huawei's Ascend chip line, SMIC's foundry capabilities, and now software ecosystem advantages through model optimization. The strategy aims for self-sufficiency in AI compute even under maximum US export restrictions.
V4 will still be available globally. DeepSeek plans to release the weights under Apache 2.0 licensing, making them freely usable worldwide. However, the hardware optimization advantage means the model will run best on Huawei hardware at launch. Global developers can still run V4 on Nvidia GPUs, but they may experience suboptimal performance until Nvidia's engineering teams complete their own optimizations, which could take weeks or months.