TL;DR: The Biden-era export control framework is giving way to something more surgical. The Trump administration is considering a per-customer cap of 75,000 NVIDIA H200 GPUs for Chinese buyers, with AMD's MI325X counting toward the same ceiling. Alibaba and ByteDance have each requested more than 200,000 chips. Zero H200s have shipped to China since January 2026 licenses were approved. A proposed total China-wide ceiling sits at 1 million chips across all customers. At $25,000–$30,000 per unit, the proposed cap represents a hard limit on Chinese AI training infrastructure that could constrain frontier model development for years.
What you will learn
- The 75,000-chip cap: what is actually being proposed
- Why Alibaba and ByteDance want 200,000+ chips each
- Zero H200s shipped to China since January: the gap between approval and delivery
- AMD MI325X counts toward the same cap: what that means
- The 1 million chip ceiling: a national quota for Chinese AI compute
- Export control escalation timeline: from H100 to per-customer limits
- What 75,000 H200s actually buys in AI training capacity
- China's response: domestic alternatives and the Huawei play
- What this means for NVIDIA's revenue
- The geopolitical logic behind per-customer limits
- How Chinese labs are likely to respond
- Frequently asked questions
The 75,000-chip cap: what is actually being proposed
The number that matters most is 75,000 — and the framing it sits inside.
Previous US export controls on AI chips operated as categorical bans: specific chips above certain performance thresholds were blocked from sale to China entirely. The H100 was banned. Then a softened H800 variant was allowed, then banned again. The A100 followed a similar path. The controls were blunt instruments — either a chip was exportable or it was not.
The proposed H200 framework is different in kind. Rather than a categorical ban, the administration is considering a per-customer allocation system: each approved Chinese buyer can import up to 75,000 H200 GPUs total, with that allocation shared across NVIDIA and AMD hardware that clears the same performance threshold. Companies that want more can apply for more, and the government decides whether to grant an exception or hold the line.
The policy is still being deliberated inside the Commerce Department's Bureau of Industry and Security. It has not been formally codified into regulation. But the direction is clear: the US is moving from chip-level export controls toward customer-level compute quotas. This is a more granular, more durable, and more enforceable framework than anything that has come before it.
The implications of this shift are significant. Under a categorical ban, Chinese buyers know the rules immediately and plan around them. Under a quota system, buyers must apply, justify, and wait — creating ongoing leverage for the US government and ongoing uncertainty for Chinese AI labs trying to plan multi-year infrastructure build-outs.
Why Alibaba and ByteDance want 200,000+ chips each
The gap between 75,000 and 200,000+ is not a negotiating position. It reflects genuine demand driven by the scale of AI training workloads at frontier capability levels.
Both Alibaba and ByteDance are building AI systems that compete directly with frontier Western models. Alibaba's Qwen series has consistently benchmarked at or near GPT-4-level capability across multiple evaluations. ByteDance's Doubao assistant and its underlying Doubao-pro-32k model serve hundreds of millions of users on TikTok and Douyin. Both companies are training the next generation of models, and training at frontier scale requires compute that scales with model parameter count, context length, and training data volume.
The math on why 200,000 chips is not an arbitrary number: by the standard rule of thumb of 6 × parameters × tokens, training a frontier-scale model in the ~700B parameter range on a trillion-token dataset requires on the order of 4 × 10²⁴ FLOPs, or roughly 1.5 million H200 GPU-hours at realistic utilization. No real cluster runs at 100% of theoretical peak; in practice that means thousands of chips dedicated for weeks, or hundreds for months, per training run. A cluster of 200,000 H200s represents roughly 400 exaflops of peak BF16 compute, in the range of what the largest Western training runs are using in 2025–2026.
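The arithmetic can be sketched with the widely used 6 × parameters × tokens FLOPs rule of thumb. The 40% utilization figure is an assumption within the typical 30–50% range, and the numbers are illustrative, not a capacity plan:

```python
# Back-of-envelope training math using the common 6 * params * tokens
# FLOPs rule of thumb and the H200's ~1,979 TFLOPS peak BF16 throughput.
# MFU (model FLOPs utilization) of 40% is an assumed mid-range value.

H200_PEAK_FLOPS = 1979e12   # BF16 FLOP/s per H200, theoretical peak

def training_gpu_hours(params: float, tokens: float, mfu: float = 0.40) -> float:
    """GPU-hours to train a dense model of `params` parameters on `tokens` tokens."""
    total_flops = 6 * params * tokens     # forward + backward pass estimate
    sustained = H200_PEAK_FLOPS * mfu     # realistic sustained FLOP/s per GPU
    return total_flops / sustained / 3600

def cluster_peak_exaflops(n_chips: int) -> float:
    """Theoretical peak BF16 exaflops of an n_chips H200 cluster."""
    return n_chips * H200_PEAK_FLOPS / 1e18

hours = training_gpu_hours(params=700e9, tokens=1e12)
print(f"~{hours / 1e6:.1f}M GPU-hours")                  # ~1.5M GPU-hours
print(f"~{cluster_peak_exaflops(200_000):.0f} EF peak")  # ~396 EF for 200k chips
```

Note that a 1-trillion-token dataset is conservative by frontier standards; recent frontier runs reportedly train on 10T+ tokens, which multiplies these figures accordingly.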
Both companies are also building inference infrastructure alongside training clusters. Serving a model at Alibaba's or ByteDance's user volume requires sustained GPU capacity measured in the tens of thousands of chips just for inference. The 200,000-chip requests are not padding — they reflect realistic infrastructure planning for companies competing at global AI scale.
The 75,000-chip cap, if enforced, cuts that ambition roughly in half. That is precisely the intent.
Zero H200s shipped to China since January: the gap between approval and delivery
In January 2026, the Commerce Department approved a set of H200 export licenses for specific Chinese customers. The licenses were real. The chips have not moved.
The reason is procedural but consequential: Commerce approved the licenses but has not yet finalized the regulatory framework governing allocation, reporting, and end-use verification. Without that framework in place, the licenses sit in a legal limbo. NVIDIA cannot export under licenses whose compliance obligations are undefined. The Chinese customers cannot take delivery without triggering potential liability for both sides.
The zero-shipment status as of early March 2026 is not NVIDIA dragging its feet. NVIDIA has every financial incentive to ship. The company generates approximately $25,000–$30,000 per H200 in revenue, and the approved orders represent billions of dollars in deferred revenue. The delay is the US government's internal process failing to keep pace with its own licensing decisions.
This gap matters for two reasons. First, it means Chinese AI labs are training on their existing H800s, A100s, and Huawei Ascend chips while they wait, which is a real constraint on training progress. Second, it means the policy debate over the 75,000-chip cap is unfolding while the January approvals have had zero practical effect, creating a window in which the government can change the rules before any chips actually ship.
The administration may use that window deliberately — finalizing a new framework that supersedes the January licenses before a single H200 reaches a Chinese data center.
AMD MI325X counts toward the same cap: what that means
The inclusion of AMD's MI325X in the same allocation ceiling as the NVIDIA H200 is a policy detail with significant implications.
The MI325X — AMD's current flagship AI accelerator — delivers performance roughly comparable to the H200 on inference workloads and competitive on training at certain model sizes. Its inclusion in the same regulatory bucket signals that the US government is controlling by compute tier, not by company. Any chip above a certain performance threshold counts against the Chinese customer's quota, regardless of whether it comes from NVIDIA or AMD.
This framework closes a loophole that existed in earlier export control rounds, when Chinese buyers who were blocked from NVIDIA hardware could simply pivot to AMD. Under the proposed per-customer cap, buying MI325X chips eats into the same 75,000-chip allocation as buying H200s. There is no arbitrage between vendors.
The practical effect is that Chinese AI labs face a unified compute ceiling across all US-origin high-performance AI chips, rather than separate ceilings that could be combined to access more total compute. The policy is more airtight than previous approaches — and harder for chip vendors to work around through product redesign.
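A toy ledger illustrates the vendor-agnostic accounting described above. The class, SKU names, and threshold set are hypothetical, not any real BIS data model:

```python
# Toy sketch of a vendor-agnostic per-customer quota ledger: H200 and
# MI325X orders draw down the same 75,000-chip allocation, while
# below-threshold chips (e.g. the H20) bypass it entirely.

from dataclasses import dataclass

CAP = 75_000
CAPPED_SKUS = {"NVIDIA H200", "AMD MI325X"}   # above the performance threshold

@dataclass
class CustomerQuota:
    name: str
    used: int = 0

    def remaining(self) -> int:
        return CAP - self.used

    def order(self, sku: str, qty: int) -> bool:
        """Approve the order if it fits the unified ceiling; reject otherwise."""
        if sku not in CAPPED_SKUS:
            return True                # below threshold: not counted at all
        if self.used + qty > CAP:
            return False               # would exceed the 75,000-chip cap
        self.used += qty
        return True

q = CustomerQuota("ExampleBuyer")
q.order("NVIDIA H200", 50_000)         # approved; 25,000 remaining
q.order("AMD MI325X", 30_000)          # rejected; no cross-vendor arbitrage
print(q.remaining())                   # 25000
```

The design choice the policy makes is visible in the code: the cap is keyed to the customer and the performance tier, not the vendor, so there is no allocation left to combine across NVIDIA and AMD.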
The 1 million chip ceiling: a national quota for Chinese AI compute
The per-customer limit of 75,000 chips sits inside a broader proposed framework: a total China ceiling of approximately 1 million H200-equivalent chips across all approved customers combined.
At 75,000 per customer, that implies the government anticipates approving roughly 13–14 major Chinese customers total — consistent with the actual population of Chinese AI companies large enough to make credible use of that kind of compute. The list would include Alibaba, ByteDance, Tencent, Baidu, Huawei's cloud division, SenseTime, Zhipu AI, and a handful of others.
The national ceiling functions as a macro-level constraint on Chinese AI training capacity. Even if individual customers receive their full 75,000-chip allocations, China's total approved high-performance AI compute from US-origin chips would be capped at roughly 2,000 exaflops (about 2 zettaflops) of peak BF16 compute, assuming the full ceiling is filled with H200s, and closer to 600–1,000 exaflops at realistic cluster efficiency. For context, US hyperscalers are estimated to operate aggregate AI training compute at several times that figure, with no ceiling on further growth.
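The national-ceiling arithmetic, using the H200 peak BF16 figure cited later in this piece, works out as follows (illustrative only):

```python
# National-ceiling arithmetic: implied number of approved customers and
# aggregate peak compute if the full 1M-chip ceiling were filled with H200s.

NATIONAL_CEILING = 1_000_000
PER_CUSTOMER_CAP = 75_000
H200_PEAK_FLOPS = 1979e12   # BF16 FLOP/s per H200, theoretical peak

implied_customers = NATIONAL_CEILING / PER_CUSTOMER_CAP
peak_exaflops = NATIONAL_CEILING * H200_PEAK_FLOPS / 1e18

print(f"~{implied_customers:.1f} implied customers")   # ~13.3
print(f"~{peak_exaflops:.0f} EF peak (~2 zettaflops)")
```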
The gap is the point. Export controls are not designed to freeze Chinese AI development entirely — that would require controlling hardware already inside China, which is not practically achievable. The controls are designed to prevent China from compounding its advantage with additional US-origin compute at a rate that closes the training-compute gap. The 1 million chip ceiling, if maintained, keeps that gap from narrowing through hardware imports.
Export control escalation timeline: from H100 to per-customer limits
Understanding where the 75,000-chip cap fits requires tracing how US chip export controls have escalated over the past three years.
October 2022: The Biden administration's first major chip export rules banned the sale of A100 and H100 GPUs to China and Russia. The controls targeted chips above a specific performance-compute threshold. NVIDIA immediately began designing the A800 and H800 — variants with artificially reduced interconnect bandwidth to fall below the threshold.
October 2023: The Commerce Department closed the A800/H800 loophole by updating the performance threshold. The H800 was banned. NVIDIA began designing the H20 — a dramatically downgraded chip built specifically for the post-2023 control regime.
January 2025: The outgoing Biden administration introduced the "AI Diffusion Rule," creating a three-tier country system: Tier 1 (close allies, no restrictions), Tier 2 (most countries, volumetric limits), Tier 3 (China, Russia, near-total ban). The rule was complex, drew industry criticism, and set up the transition for the Trump administration.
January 2026: The Trump administration began reviewing the AI Diffusion Rule and approved a set of H200 export licenses for specific Chinese customers — a notable departure from the Biden-era approach. Zero chips shipped due to the framework gap described above.
February–March 2026: The per-customer cap of 75,000 H200s and the national ceiling of 1 million chips enter active deliberation.
The trajectory is not linear escalation. It is iterative refinement: each round of controls creates workarounds, those workarounds get closed, and the framework grows more sophisticated with each cycle. The per-customer quota is the most architecturally sophisticated approach yet — it is harder to design around than a performance threshold.
What 75,000 H200s actually buys in AI training capacity
The 75,000-chip cap only makes sense if you understand what it means in concrete AI training terms.
A single NVIDIA H200 delivers approximately 3.35 terabytes per second of memory bandwidth and 1,979 TFLOPS of BF16 floating-point performance. In a tightly coupled cluster, H200s communicate via NVLink at 900 GB/s per GPU — the interconnect bandwidth that makes large-model training practical by minimizing the bottleneck of passing gradients between GPUs during backpropagation.
A cluster of 75,000 H200s provides roughly 148 exaflops of BF16 compute at theoretical peak. Real-world training efficiency on large clusters typically runs at 30–50% of theoretical peak due to communication overhead, batch management, and memory constraints. That means a realistic effective compute of roughly 45–75 exaflops for training workloads.
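A minimal sketch of the peak-versus-realized arithmetic, using the 30–50% efficiency range quoted above:

```python
# Peak vs. realized BF16 compute for the capped cluster size, using the
# 30-50% large-cluster training efficiency range cited in the text.

CHIPS = 75_000
H200_PEAK_FLOPS = 1979e12   # BF16 FLOP/s per H200, theoretical peak

peak_ef = CHIPS * H200_PEAK_FLOPS / 1e18
lo, hi = 0.30 * peak_ef, 0.50 * peak_ef
print(f"peak ~{peak_ef:.0f} EF, effective ~{lo:.0f}-{hi:.0f} EF")
```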
The practical reading: 75,000 H200s is enough to train GPT-4-class models and enough to run significant inference infrastructure. It is not enough to train the frontier models of 2027–2028, which are projected to require 2–4x the compute of today's largest training runs. The cap is calibrated to allow China to remain competitive at current capability levels while constraining its ability to leapfrog to the next generation.
China's response: domestic alternatives and the Huawei play
The US export control framework only constrains Chinese AI development to the extent that China lacks viable domestic alternatives. The answer to whether those alternatives exist is: partially, and improving faster than the controls anticipated.
Huawei's Ascend 910B and 910C chips are the most credible domestic substitutes. The Ascend 910C delivers an estimated 256 TFLOPS of BF16 compute per chip — roughly 7–8x less than the H200 on a per-chip basis. To replicate the training capacity of 75,000 H200s, a Chinese lab would need approximately 500,000–600,000 Ascend chips. Huawei does not currently produce at that volume. Its chip manufacturing relies on SMIC's 7nm process, which has yield constraints that limit the realistic supply.
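The substitution math behind the 500,000–600,000 figure, comparing peak BF16 throughput only and ignoring the interconnect, software-stack, and yield gaps that all cut against the Ascend side:

```python
# Rough substitution math: Ascend 910C chips needed to match 75,000 H200s
# on peak BF16 throughput alone. The 256 TFLOPS figure is the estimate
# quoted in the text, not a confirmed spec.

H200_TFLOPS = 1979
ASCEND_910C_TFLOPS = 256

per_chip_ratio = H200_TFLOPS / ASCEND_910C_TFLOPS   # ~7.7x per chip
ascend_needed = 75_000 * per_chip_ratio
print(f"~{ascend_needed:,.0f} Ascend 910C chips")   # within the 500k-600k range
```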
But the trajectory matters more than the current snapshot. Huawei has been investing aggressively in chip design and domestic manufacturing capacity. The Ascend 910C launched in late 2025. The 910D is reportedly in development with projected performance improvements of 30–40%. If Huawei ships a chip within 3x of H200 performance by 2027, the calculus changes materially.
In the meantime, Chinese labs are adapting their training strategies to lower-performance hardware: more efficient model architectures (the DeepSeek-style mixture-of-experts approach that achieves frontier capability with less raw compute), longer training runs on smaller clusters, and algorithmic innovations that substitute for compute. The export controls are working — but they are also accelerating exactly the kind of efficiency-focused AI research that allows China to close the gap through software rather than hardware.
What this means for NVIDIA's revenue
NVIDIA is caught between two realities: the US government is its regulatory constraint, and China is one of its largest potential markets.
Prior to the October 2022 export controls, China represented approximately 25% of NVIDIA's data center revenue. After successive rounds of controls, NVIDIA developed China-specific chips (A800, H800, H20) designed to remain compliant. The H20 — a chip dramatically downgraded from the H100 to meet the post-2023 performance threshold — has been NVIDIA's primary China data center product since late 2023.
The proposed 75,000-chip cap on H200s does not directly affect H20 sales — the H20 falls below the performance threshold that triggers the cap. Chinese buyers can still purchase H20s in unlimited quantities (subject to general export licensing). The cap only bites on the higher-performance H200.
But the cap does constrain the upside scenario that NVIDIA was potentially banking on: the January 2026 H200 license approvals suggested a possible thaw that would allow NVIDIA to sell its actual flagship product to Chinese hyperscalers. A 75,000-chip per-customer limit, if finalized, caps that revenue opportunity at roughly $1.875B–$2.25B per Chinese customer at list price — significant, but far below the unconstrained demand from companies requesting 200,000+ chips.
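The per-customer revenue ceiling at the quoted $25,000–$30,000 list price:

```python
# Revenue ceiling per approved customer under the 75,000-chip cap,
# at the $25k-$30k per-unit list price quoted in the text.

CAP = 75_000
PRICE_LOW, PRICE_HIGH = 25_000, 30_000

rev_low = CAP * PRICE_LOW / 1e9    # in billions of dollars
rev_high = CAP * PRICE_HIGH / 1e9
print(f"${rev_low:.3f}B-${rev_high:.2f}B per customer")   # $1.875B-$2.25B
```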
At NVIDIA's current $130.5B annual revenue run rate, China data center revenue under the cap regime is modest relative to what the US, European, and Japanese markets are delivering. The financial impact is real but manageable. The strategic impact (NVIDIA ceding the high-performance end of the Chinese market to Huawei) is harder to recover from long-term.
The geopolitical logic behind per-customer limits
The shift from categorical bans to per-customer quotas reflects a more mature theory of how export controls work in practice.
Categorical bans are simple to administer but create a binary dynamic: either China has access to a chip or it does not. When a ban is in place, China-based buyers have strong incentives to acquire banned chips through third countries, shell companies, and gray-market channels. The US has documented hundreds of cases of H100 diversion through Southeast Asian intermediaries since the October 2022 controls took effect.
A per-customer quota system changes the incentive structure. Approved customers have a legitimate pathway to acquire significant compute — up to $2.25B worth of H200s at list price — which reduces the economic incentive to take legal and reputational risk through gray channels. The government gains ongoing visibility into which companies are acquiring what compute, enabling end-use monitoring that categorical bans make impossible (if everything is banned, there is nothing to monitor).
The quota system also creates diplomatic leverage that categorical bans do not. The US government can adjust individual customer allocations in response to behavior, security assessments, or bilateral negotiations. A company that complies with end-use reporting requirements might see its allocation increased. A company found to be diverting chips to entities of concern might see its allocation revoked. This is a more nuanced instrument than a binary on/off control.
The 75,000-chip number itself is likely the output of analysis on what compute level allows frontier AI research without enabling military-relevant AI capabilities at the scale the US intelligence community considers most dangerous. The specific figure is less important than what it represents: a deliberate choice to maintain conditional access rather than a categorical embargo.
How Chinese labs are likely to respond
Chinese AI companies are not passive observers of US export control policy. They plan around it, lobby against it, and build alternatives to reduce dependency on it. The response to a 75,000-chip cap is already visible in the decisions companies made before the cap was formally proposed.
Stockpiling: Both Alibaba and ByteDance accelerated H800 and H20 procurement in 2024 and early 2025, anticipating tighter controls. Chinese AI labs collectively hold substantial inventories of lower-performance but still functional US-origin AI chips that will remain in service for years.
Architecture optimization: DeepSeek's V3 and V4 models demonstrated that frontier-level capability is achievable with dramatically less compute than the conventional approach. Chinese labs are investing heavily in the research that allows them to train competitive models on smaller clusters. The 75,000-chip cap is, paradoxically, accelerating this research by making compute efficiency an existential priority.
Huawei clustering: The practical challenge with Huawei Ascend chips is not individual chip performance — it is cluster interconnect. Huawei's CANN software stack and HiLink interconnect do not match NVIDIA's NVLink ecosystem for large-scale training. Chinese labs are investing in software to close this gap, with varying success depending on model size and architecture.
Regulatory arbitrage: Some Chinese companies are exploring whether AI compute acquired through cloud providers in non-restricted jurisdictions — such as Singapore, the UAE, or Malaysia — can be used for workloads that the US would otherwise restrict. The Commerce Department is aware of this pathway and has added several non-US jurisdictions to enhanced monitoring lists.
The net effect is that export controls are working as intended — creating friction, cost, and delay for Chinese AI training — while also accelerating exactly the kind of indigenous technological development that makes those controls less effective over time.
Frequently asked questions
What is the 75,000-chip cap and is it final?
The 75,000-chip cap is a proposed per-customer limit on NVIDIA H200 GPU exports to Chinese buyers, under active deliberation inside the Commerce Department's Bureau of Industry and Security as of early March 2026. It has not been codified into regulation. The administration could finalize it, modify the number, or restructure the framework entirely. The direction — toward per-customer compute quotas rather than categorical bans — appears settled even if the specific number is still being negotiated.
Why is NVIDIA the company most affected, not AMD?
NVIDIA dominates the AI GPU market with approximately 80–85% market share in high-performance AI accelerators. AMD's MI325X is included in the same cap framework, but AMD's China data center exposure is a fraction of NVIDIA's. The 75,000-chip cap constrains NVIDIA's upside from the January 2026 H200 license approvals more than any other company in the supply chain.
Can Chinese companies get around the cap through third countries?
Diversion through third countries has occurred with H100 chips and is a documented risk the US government is actively working to close. The per-customer quota system provides better end-use monitoring infrastructure than categorical bans, making large-scale diversion harder to conduct without detection. However, the US intelligence community assesses that some diversion will continue regardless of the framework. The controls are designed to limit, not eliminate, Chinese access to US-origin high-performance AI chips.
What happens to the January 2026 H200 licenses if the cap is finalized?
The January 2026 licenses that approved specific Chinese customers for H200 purchases have not resulted in any shipments. If the 75,000-chip cap framework is finalized, those licenses would likely be revised to conform with the new per-customer limit. Companies that were approved for larger quantities under the January framework would see their effective allocation reduced to 75,000 chips.
Does the H20 chip count toward the cap?
The H20 — NVIDIA's China-specific chip with significantly reduced performance — does not appear to meet the performance threshold that triggers the proposed cap. Chinese buyers can continue purchasing H20s in larger quantities, subject to standard export licensing. The cap specifically targets H200-class performance, which the H20 falls well below by design.
How does the 75,000-chip limit compare to US company allocations?
US-based companies face no chip acquisition limits. Microsoft, Google, Amazon, and Meta are acquiring H100s, H200s, and Blackwell B200s in quantities of 100,000–500,000+ units across their respective data center build-outs. The asymmetry is intentional: the US maintains a compute advantage by allowing domestic hyperscalers to scale unconstrained while capping Chinese competitors at a fraction of that capacity.
What does this mean for Chinese AI competitiveness in 2026–2027?
Chinese AI labs will remain competitive at current frontier capability levels — training models comparable to GPT-4 and Claude 3.5 is achievable within the 75,000-chip cap. The constraint bites on the next generation: models projected to require 150,000–300,000 GPU-equivalents of training compute. If the cap holds, China enters the 2027–2028 frontier model generation with a structural compute disadvantage relative to US-based labs — unless domestic Huawei infrastructure scales faster than current projections suggest.