Apple asks Google to build dedicated cloud servers for Gemini-powered Siri
Apple is negotiating with Google to build dedicated cloud infrastructure for a Gemini-powered Siri overhaul, reportedly paying ~$1B/year in licensing fees, with a target launch in iOS 26.4.
TL;DR: Apple is in active negotiations with Google to deploy Gemini as the AI backbone for a major Siri overhaul, with reports indicating a ~$1 billion per year licensing arrangement and — crucially — a demand for dedicated, non-shared Google cloud infrastructure built to Apple's privacy specifications. The target launch window is iOS 26.4, expected later in 2026. For the company that built its brand on "what happens on your iPhone stays on your iPhone," outsourcing Siri's intelligence to the world's largest ad-targeting company is the most consequential — and ironic — strategic bet in consumer tech.
The core of the reported negotiation: Apple wants Google's Gemini models to power a substantially upgraded Siri, and it wants that capability delivered through dedicated cloud infrastructure that Google would build and operate exclusively for Apple — not on Google's shared commercial cloud.
This is not a standard enterprise API deal. Apple is not asking for access to Google's Gemini API endpoints on the same terms available to any developer with a Google Cloud account. The reported demand is structural: Apple wants Google to provision dedicated servers, isolated from Google's multi-tenant cloud, running Gemini model weights configured and tuned to Apple's specifications, accessible only through Apple's own routing infrastructure.
The scope of that ask — and Google's reported willingness to accommodate it — signals how seriously both companies are treating this negotiation. For Google, building and maintaining dedicated infrastructure for a single customer at this scale is a significant operational commitment. For Apple, demanding it signals that shared infrastructure is a non-starter, regardless of whatever contractual privacy guarantees Google might offer.
The reporting on this deal has primarily come from Bloomberg's Mark Gurman, whose track record on Apple supply chain and partnership details over the past decade has made him the most reliable source for pre-announcement Apple deal reporting. Gurman's framing positions the dedicated infrastructure requirement as Apple's primary technical condition for the deal to proceed — without it, Apple does not trust the privacy architecture sufficiently to use Gemini for Siri.
Apple's insistence on dedicated infrastructure is not arbitrary. It reflects a specific technical and legal concern about what "private" actually means in a cloud context.
When AI inference runs on shared cloud infrastructure, the underlying hardware — CPUs, GPUs, networking fabric, storage — services multiple customers' workloads simultaneously. Modern hypervisors and containerization prevent one customer's data from being directly readable by another. But shared infrastructure creates several categories of risk that Apple's privacy posture cannot accept:
Inference logs. In a standard cloud API setup, the cloud provider maintains logs of inference requests — which model received what input, at what time, at what scale. Even if input content is not stored, metadata about Apple's inference volume, query patterns, and feature usage would be visible to Google as the infrastructure operator. For Apple, that metadata is itself sensitive.
Model tuning visibility. If Apple trains or fine-tunes Gemini models on Apple-specific data — user preferences, app interaction patterns, Siri personalization signals — that training data would need to pass through Google's infrastructure. On dedicated servers, Apple can structure that data flow under its own security controls. On shared infrastructure, the chain of custody is Google's.
Regulatory compliance. Apple operates across jurisdictions with significantly different data-localization and processing requirements — GDPR in Europe, data sovereignty rules in China, various state privacy statutes in the United States. Dedicated infrastructure allows Apple to configure data routing such that EU user queries only ever reach EU-hosted servers, Chinese user queries only reach China-compliant infrastructure, and so on. Shared multi-tenant cloud makes those routing guarantees significantly harder to enforce and audit.
Contractual enforceability. Apple's privacy commitments to users are not just marketing — they create legal obligations in multiple jurisdictions. A dedicated infrastructure arrangement allows Apple to write contracts with Google that directly govern the physical servers processing user data, rather than relying on Google's general cloud terms of service applied uniformly across all customers.
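The regulatory-compliance point above can be made concrete with a sketch of jurisdiction-aware routing. This is an illustration of the general pattern, not Apple's actual design: the region names, endpoint hostnames, and policy table are all invented for the example. The key property is that routing fails closed — a query with no compliant endpoint is rejected rather than sent to a default region.

```python
# Hypothetical sketch of jurisdiction-aware inference routing of the kind
# dedicated infrastructure makes enforceable. All region names, endpoints,
# and policy entries below are invented for illustration.
from dataclasses import dataclass

# Each jurisdiction only ever resolves to endpoints provisioned inside
# that jurisdiction's compliance boundary.
REGION_ENDPOINTS = {
    "EU": ["gemini-dedicated.eu-west.example"],
    "CN": ["gemini-dedicated.cn-north.example"],
    "US": ["gemini-dedicated.us-east.example", "gemini-dedicated.us-west.example"],
}

@dataclass
class SiriQuery:
    user_region: str   # derived from device locale / account jurisdiction
    payload: bytes

def route(query: SiriQuery) -> str:
    """Return a compliant inference endpoint, failing closed if none exists
    for the user's jurisdiction."""
    endpoints = REGION_ENDPOINTS.get(query.user_region)
    if not endpoints:
        raise PermissionError(f"no compliant endpoint for region {query.user_region}")
    # Deterministic pick for the sketch; a real router would load-balance.
    return endpoints[0]
```

On shared multi-tenant cloud, the equivalent guarantee depends on the provider's internal scheduling; on dedicated servers, the table above is something Apple can configure, audit, and contractually pin down.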
The dedicated infrastructure demand is, in this reading, Apple's minimum viable privacy architecture for any AI model it cannot run entirely on-device.
The ~$1 billion annual licensing figure reported for this deal is worth contextualizing against Apple's existing AI-adjacent revenue arrangements.
Apple currently pays Google an estimated $15–20 billion per year to be the default search engine in Safari across all Apple devices. That agreement — under scrutiny in the Department of Justice's antitrust case against Google — represents one of the largest business-to-business technology deals in history. The Gemini licensing fee, at $1 billion annually, is a fraction of that relationship but still substantial.
For comparison: OpenAI's deal to power ChatGPT integration within Apple Intelligence (announced at WWDC 2024) is believed to involve a revenue-sharing arrangement rather than a flat licensing fee, with Apple directing significant user traffic to ChatGPT in exchange for integration access. The Gemini deal, structured as a per-year licensing payment, is structurally different — Apple would be paying for model capability regardless of usage volume.
The $1 billion figure also needs to be understood against the cost of the dedicated infrastructure Apple is requesting. Building and operating dedicated GPU clusters sufficient to handle Siri inference at iPhone-scale usage — hundreds of millions of active devices, each potentially generating dozens of inference requests per day — is not a trivial infrastructure investment. The licensing fee is partly compensation for model access and partly an offset for Google's cost of building Apple-specific infrastructure that cannot be amortized across other customers.
At 1.5 billion active Apple devices worldwide, $1 billion annually works out to roughly $0.67 per device per year in licensing cost — a rounding error against Apple's average revenue per device, but a meaningful line item when you consider the competitive stakes attached to Siri's quality.
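The back-of-envelope math is easy to check. The per-day request figure below assumes "dozens" means roughly 24 requests per device — an assumption for illustration, not a number from the reporting.

```python
# Back-of-envelope check of the figures in the text.
licensing_fee = 1_000_000_000     # ~$1B/year reported licensing fee
active_devices = 1_500_000_000    # ~1.5B active Apple devices cited in the text

cost_per_device = licensing_fee / active_devices
print(f"${cost_per_device:.2f} per device per year")   # → $0.67 per device per year

# Rough inference volume, assuming ~24 requests per device per day
# (an illustrative assumption, not a reported number).
requests_per_day = active_devices * 24
print(f"{requests_per_day / 1e9:.0f}B requests/day")   # → 36B requests/day
```

Even at a conservative request rate, the dedicated clusters would be serving tens of billions of inferences daily — context for why the infrastructure build is itself a major cost component of the deal.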
The iOS 26.4 target launch window positions this Gemini-powered Siri upgrade as a mid-cycle release rather than a headline WWDC feature.
iOS versioning context: Apple shipped iOS 26 at the fall 2025 cycle. iOS 26.4 would be the fourth major point release of that cycle, historically arriving in the late spring or early summer of the following calendar year — placing the target window somewhere around May–June 2026, coinciding with WWDC 2026 or immediately preceding it.
The .4 release number is significant. Apple uses point releases for feature additions rather than bug fixes (bug fixes ship in patch updates such as 26.0.1). If Gemini-powered Siri is in iOS 26.4, it signals Apple is treating this as an additive capability rollout — not a ground-up Siri replacement, but an enhancement to the existing Apple Intelligence Siri architecture that shipped with iOS 26.
The .4 timeline also suggests this is not yet a finalized deal. Shipping a capability in May or June 2026 requires completed integration work several months in advance. If negotiations are still ongoing as of early March 2026, the timeline is tight.
February 2026 produced a pair of Apple executive statements about the Siri-Gemini deal that directly contradicted each other — a communications miscalculation that generated significant analyst and press coverage.
Statement one: Apple CEO Tim Cook, asked about the reported Gemini integration during a Q1 2026 earnings call question, described the scope as a potential "limited integration" for specific query types where on-device models were insufficient. Cook's framing was minimizing — he positioned any cloud AI partnership as narrow and auxiliary, not structural.
Statement two: Apple SVP of Machine Learning and AI Strategy (unnamed in the original reporting) gave a separate briefing to a small group of media in which the Gemini integration was characterized as intended to become the "primary intelligence backbone" for Siri's next generation, with on-device processing handling only latency-sensitive tasks and personalization.
The gap between "limited integration for specific query types" and "primary intelligence backbone" is not a semantic difference. It represents fundamentally different product architectures. Cook's framing would produce a Siri that remains primarily on-device with occasional Gemini augmentation — similar to how ChatGPT currently plugs into Apple Intelligence for complex queries. The SVP framing would produce a Siri that routes the majority of its reasoning through Gemini, with on-device models handling only the fast-response layer.
Analysts covering Apple — including those at Morgan Stanley, Bernstein, and Wedbush — flagged the contradiction publicly. The most charitable interpretation is that the deal's scope was still being internally debated in February and that different Apple executives were briefed on different versions of the negotiating position. The less charitable interpretation is that Apple was testing market reaction to different framings of the deal before committing to a public narrative.
The contradictory statements have not been resolved with a clarifying announcement as of March 3, 2026.
The central tension of this deal is impossible to ignore: Apple has built one of the world's most recognized consumer brands on privacy, and it is negotiating to route Siri queries through Google — the company whose core business model is advertising revenue derived from understanding what users are interested in, want to buy, and spend time thinking about.
Apple's response to this tension, when pressed, is architectural: the dedicated infrastructure requirement, combined with contractual prohibitions on Google using Siri query data for any purpose other than inference, creates a technical and legal barrier between Gemini processing Apple user queries and Google's advertising systems.
The question is whether that architecture is genuinely verifiable.
When Apple deployed Private Cloud Compute for its own server-side Apple Intelligence processing, it did something unusual: it published the source code for the Private Cloud Compute attestation system, allowing independent security researchers to verify that Apple's claims about not being able to inspect processed data were accurate at the cryptographic level. The architecture was auditable. Third parties could — and did — review it.
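The distinction between a verifiable guarantee and a promised one can be made concrete with a toy model of the attestation pattern. This is not Apple's actual protocol — Private Cloud Compute uses hardware-rooted keys and a public transparency log, for which `hmac` and a hard-coded set stand in here — but it captures the essential property: the client refuses to send data unless the server proves it is running software whose measurement appears on a published known-good list.

```python
# Toy model of measurement-based attestation. The client only releases data
# to a server whose signed software "measurement" matches a published
# known-good list. Keys and image names are invented for illustration.
import hashlib
import hmac

ATTESTATION_KEY = b"demo-signing-key"   # stand-in for a hardware-rooted key
PUBLISHED_MEASUREMENTS = {              # stand-in for a public transparency log
    hashlib.sha256(b"pcc-release-2026.1").hexdigest(),
}

def sign_measurement(software_image: bytes) -> tuple[str, str]:
    """Server side: measure the running software image and sign the measurement."""
    measurement = hashlib.sha256(software_image).hexdigest()
    tag = hmac.new(ATTESTATION_KEY, measurement.encode(), hashlib.sha256).hexdigest()
    return measurement, tag

def client_will_send(measurement: str, tag: str) -> bool:
    """Client side: verify the signature AND that the measurement is public."""
    expected = hmac.new(ATTESTATION_KEY, measurement.encode(), hashlib.sha256).hexdigest()
    return hmac.compare_digest(tag, expected) and measurement in PUBLISHED_MEASUREMENTS
```

The point of the sketch is that the guarantee is checkable by the client before any data leaves the device, rather than promised in a contract the client cannot inspect.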
With a Google-operated dedicated infrastructure arrangement, the equivalent transparency is significantly harder to achieve. Apple can audit the contractual terms. Apple can inspect the server configurations at time of setup. Apple's security team can conduct periodic reviews. But Google would be operating the infrastructure, and the depth of Apple's ongoing visibility into what actually happens on those servers in real time is limited by operational realities.
The privacy guarantee for this deal is ultimately a contractual and audit-based guarantee rather than a cryptographic one. That is a meaningful distinction for a company whose privacy marketing has consistently implied that users do not need to take anything on trust.
Apple's counter-argument — which has not been made publicly but is implied by the architecture demand — is that the dedicated infrastructure requirement, combined with the prohibition on shared tenancy, creates a meaningful technical boundary: Google cannot use dedicated-Apple infrastructure for non-Apple workloads by definition, which eliminates the most direct path through which Siri data could contaminate Google's user modeling systems.
Whether that argument satisfies Apple's existing customers is a question the company will need to answer when this deal becomes public through an official announcement.
Apple's original Apple Intelligence architecture, announced at WWDC 2024 and shipped with iOS 18.1, established a clear hierarchy: on-device processing first, Private Cloud Compute second, third-party AI (ChatGPT) last resort.
The design philosophy was explicit: personal data should leave the device as rarely as possible, and when it must, it should go to Apple's own servers before going anywhere else. Third-party AI access was positioned as opt-in, user-confirmed, and isolated to specific query types.
The Gemini integration, as reported, would structurally invert that hierarchy for complex Siri tasks. If Gemini becomes the primary reasoning engine for Siri, the architecture becomes: on-device for latency-sensitive fast responses, Gemini cloud for complex reasoning. Apple's own Private Cloud Compute remains relevant for the layer between on-device and Gemini — but the third-party cloud layer moves from "last resort" to "primary intelligence backbone."
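The layered architecture described above can be sketched as a dispatcher that keeps latency-sensitive work on-device and escalates complex reasoning to the cloud tiers. The three tiers come from the text; the classification heuristic and the step thresholds are invented for illustration.

```python
# Illustrative dispatcher for the layered Siri architecture described above.
# The tiers are from the article; the heuristic and thresholds are assumptions.
from enum import Enum

class Tier(Enum):
    ON_DEVICE = "on-device model"             # latency-sensitive fast responses
    PRIVATE_CLOUD = "Private Cloud Compute"   # mid-complexity, Apple-operated
    GEMINI = "dedicated Gemini servers"       # complex multi-step reasoning

def dispatch(needs_personal_context: bool, estimated_steps: int) -> Tier:
    # Fast single-step responses stay on-device.
    if estimated_steps <= 1:
        return Tier.ON_DEVICE
    # Mid-complexity tasks without personal context go to Apple's own servers.
    if estimated_steps <= 3 and not needs_personal_context:
        return Tier.PRIVATE_CLOUD
    # Complex reasoning — including reasoning over personal context — escalates
    # to the dedicated Gemini layer, which is exactly the inversion the text notes.
    return Tier.GEMINI
```

Under the original Apple Intelligence hierarchy, the third branch would have been the rare, user-confirmed exception; under the reported Gemini architecture, it becomes the common path for anything non-trivial.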
This is not inherently a wrong architectural choice. The computational gap between what runs on a Neural Engine and what runs on a warehouse-scale Google TPU cluster is real and growing. Apple's on-device models are impressively capable given their size constraints, but they cannot match the reasoning depth of Gemini Ultra at current hardware scales. If the goal is to make Siri competitive with Google Assistant, ChatGPT, and Claude in 2026, the on-device-first architecture is a ceiling, not a strength.
But the shift has user-facing implications. Apple's marketing around Apple Intelligence has consistently emphasized that personal context — your emails, messages, calendar, photos — stays on your device or on Apple's Private Cloud Compute. A Gemini-powered Siri that performs complex reasoning across your personal data requires sending that personal context to Google's infrastructure for Gemini to reason over it. The dedicated servers and contractual prohibitions address the data-use concern, but the data still travels to Google's infrastructure.
Users who accepted Apple's privacy framing may not have anticipated that scenario when they opted into Apple Intelligence.
Apple already has one external AI partnership for Siri: the ChatGPT integration that shipped with iOS 18.2 and is available across Apple Intelligence. Understanding how the reported Gemini deal differs from the ChatGPT arrangement clarifies why this negotiation is more consequential.
ChatGPT integration (current):
- Opt-in and user-confirmed, invoked for specific query types rather than for all Siri reasoning
- Positioned as a last resort behind on-device processing and Private Cloud Compute
- Believed to be a revenue-sharing arrangement, with Apple directing user traffic to ChatGPT in exchange for integration access

Gemini integration (reported):
- Intended as the primary reasoning engine for complex Siri tasks, not an opt-in extension
- A flat ~$1 billion per year licensing fee, paid regardless of usage volume
- Dedicated, non-shared Google infrastructure built and operated to Apple's privacy specifications, with contractual prohibitions on Google's use of Siri query data
The ChatGPT integration is a referral arrangement. The Gemini integration, if structured as reported, is a core infrastructure dependency. The distinction matters for both product architecture and regulatory risk: core infrastructure dependencies create vendor lock-in concerns, update coordination requirements, and — from a regulatory perspective — potential scrutiny under competition law in jurisdictions where both Apple and Google are already facing antitrust investigations.
The DOJ's ongoing Google antitrust case explicitly targets the existing Apple-Google search default arrangement. A new Apple-Google AI infrastructure arrangement would almost certainly draw similar scrutiny, particularly given that it would further deepen the financial relationship between the two companies.
If the Gemini-powered Siri overhaul ships in iOS 26.4, the competitive landscape for AI assistants resets in a way that benefits both Apple and Google — while applying pressure to every other player.
For Apple: Siri becomes competitive with the best-in-class AI assistants without Apple needing to build and maintain frontier model capabilities at scale. Apple's investment concentrates on the user experience layer — how Siri presents, confirms, and acts on Gemini's reasoning — rather than on the model itself. This is consistent with Apple's historical approach of building on best-available components (it has long sourced display panels, NAND flash, and, until recently, cellular modems from suppliers) while controlling the integration and user experience.
For Google: The Gemini deal with Apple validates Gemini as enterprise-grade AI infrastructure at the highest-stakes possible client deployment. Google does not need Apple users to open Google apps — if Gemini powers Siri, Google's model is the reasoning engine behind every Siri interaction on 1.5 billion active Apple devices. The advertising revenue implications are constrained by contract, but the model training implications — learning from the distribution of query types that real-world Siri users ask — are significant even if individual user data is protected.
For OpenAI: ChatGPT's current Siri integration position is threatened. If Gemini becomes the primary Siri intelligence backbone rather than an opt-in extension, ChatGPT's role in the Apple ecosystem diminishes. Apple may retain ChatGPT for specific creative or writing-oriented query types, but the architecture shifts ChatGPT from "Siri's AI partner" to "one of several optional AI tools accessible through Siri."
For Microsoft/Copilot and Amazon/Alexa: The gap between Gemini-powered Siri and competing assistants widens. Neither Microsoft nor Amazon has a comparable infrastructure deal with Apple, and neither has the consumer device footprint to make a countervailing offer.
For users: A Gemini-powered Siri that actually works — that can reason across documents, complete multi-step tasks reliably, and understand context that spans apps and data sources — would be a materially different product than the Siri that has been the butt of productivity jokes for a decade. The competitive pressure this places on Google Assistant (somewhat self-competitive), ChatGPT, and Alexa is significant.
Negotiations of this complexity and scale do not always close. The dedicated infrastructure requirement, the licensing economics, the privacy architecture details, and the internal Apple disagreement about scope all represent potential failure points.
If the Gemini deal collapses, Apple's Siri options are:
Continue with current architecture: Apple Intelligence Siri as shipped in iOS 26 — capable for simple tasks, competitive for on-device personalization, meaningfully behind frontier cloud models for complex reasoning. This is not a sustainable competitive position as competing assistants continue to improve.
Accelerate in-house model development: Apple's AI research organization has grown significantly, and the company has been training larger models internally. A collapse of the Gemini deal could accelerate Apple's timeline for deploying its own cloud-scale reasoning models, potentially through expanded Private Cloud Compute capacity. The timeline for this path is measured in years, not months.
Deepen the Anthropic relationship: Apple has had conversations with Anthropic — the maker of Claude — alongside OpenAI and Google. Anthropic's Constitutional AI approach and its explicit focus on safety and privacy-alignment make it arguably the most philosophically compatible external AI partner for Apple. A Claude-powered Siri would carry fewer brand contradiction concerns than a Gemini-powered Siri. The challenge is scale: Anthropic's inference infrastructure is not at Google's scale, and a dedicated infrastructure arrangement would require significant Anthropic capital investment.
Expand ChatGPT integration: OpenAI's existing Siri integration could be expanded from opt-in to primary. This would raise the same privacy contradiction concerns as Gemini — OpenAI's business model includes using API interactions for model improvement, and the privacy architecture of an expanded ChatGPT arrangement would require the same kind of dedicated infrastructure negotiation that Apple is reportedly demanding from Google.
Each alternative carries costs. The Gemini deal, despite its strategic ironies, remains the path of least resistance if the infrastructure and privacy requirements can be resolved.
Will my Siri queries actually be processed by Google?
Under the reported deal structure, yes — your queries would be processed on Google-operated infrastructure, though on dedicated servers isolated from Google's commercial cloud. Apple is demanding contractual prohibitions on Google using Siri query data for advertising targeting or model training beyond what is necessary to serve your Siri requests. Whether those contractual guarantees are verifiable in practice is the central privacy question this deal raises.
Why doesn't Apple just build its own frontier models?
Apple is building them — the Apple Intelligence on-device models and Private Cloud Compute server-side models represent real, significant AI investment. The gap is at the frontier reasoning layer. Training and deploying a model that matches Gemini Ultra's multi-step reasoning capability requires infrastructure investment — GPU clusters, data pipelines, research talent — at a scale that takes years to build from a standing start. Licensing Gemini is the faster path to competitive capability.
How does this compare to the Safari search deal?
Structurally similar in that it is a large B2B payment from Apple to Google for a core device capability, but different in kind. The search deal makes Google the default search engine in Safari — a referral arrangement. The Gemini deal, as reported, would make Google's AI model the reasoning engine inside Siri — an infrastructure dependency. The competitive and regulatory implications are different, and the antitrust scrutiny is likely to be different as well.
When will the deal be officially announced?
No official announcement has been made as of March 3, 2026. If the deal closes in time for iOS 26.4, Apple would likely announce it at or near WWDC 2026, which typically occurs in June. An earlier point-release announcement outside WWDC is possible if the deal closes and Apple wants to get ahead of continued leak reporting.
Does this mean Apple Intelligence has failed?
Not exactly. Apple Intelligence's on-device capabilities — writing tools, photo editing, notification summaries, personal context Siri — are genuinely capable and well-executed. The gap is at the frontier reasoning level: complex multi-step tasks, document-scale understanding, and the kind of agentic behavior that frontier cloud models enable. Partnering with Google for that layer does not mean on-device Apple Intelligence is being abandoned; it means Apple is being realistic about where the competitive gaps are and choosing speed-to-market over vertical integration for the frontier layer.
What does this mean for Apple's privacy brand?
It is a significant test. Apple's privacy positioning has been one of its most durable marketing advantages, particularly against Google. A deal in which Siri's intelligence runs on Google infrastructure — regardless of architectural safeguards — requires Apple to either update its privacy narrative or risk the perception that the brand commitment is conditional on commercial convenience. How Apple communicates this deal to users, and what choices it gives them about Gemini routing, will determine whether the brand takes lasting damage or manages the transition without significant trust erosion.
Could regulators block the deal?
Possibly. The DOJ's antitrust case against Google specifically targets exclusionary arrangements with Apple in the search context. A new AI infrastructure deal between the two companies would draw regulatory attention, particularly in the EU where both companies face separate competition enforcement. However, the legal theory for blocking an AI infrastructure deal is different from blocking a search default deal, and the timeline for regulatory action is typically longer than the product development timelines in play here. The deal could ship in iOS 26.4 and face regulatory review afterward.