Apple Intelligence architecture clarified: Google Cloud, NVIDIA GPUs, and Apple’s own AI models explained
Apple has finally offered a clearer look at how its new Apple Intelligence system works, and the details reveal a far more complex AI strategy than the company initially suggested during its WWDC 2026 presentation.
At the center of the discussion is the Apple Foundation Model, or AFM, the AI model family powering many of the company’s new intelligence features across iPhone, iPad, and Mac. Apple had previously explained that Apple Intelligence uses a mix of on-device AI and cloud-based processing, but the company did not fully clarify how much of the system was built internally and how much relied on outside partners.
Now, Apple has shed more light on the setup.
According to the latest clarification, AFM Cloud is Apple’s own model. However, it was reportedly distilled from a Google Gemini model. Distillation generally means training one model to reproduce the outputs of a larger “teacher” model, so Google’s technology served as an input to training Apple’s system rather than as the system itself. Apple is emphasizing that it performed its own pre-training, post-training, reinforcement learning, and optimization work, rather than simply deploying a Google model under an Apple name.
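To make the idea of distillation concrete, here is a minimal sketch of the textbook soft-label objective, in which a student model is penalized for diverging from a teacher's output distribution. Apple has not published its training recipe, so the function names, the temperature value, and the toy logits below are all illustrative assumptions, not a description of how AFM Cloud was actually trained.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities, softened by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) over temperature-softened distributions."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Toy logits over three candidate tokens (hypothetical values).
teacher = [4.0, 1.0, 0.5]
close_student = [3.5, 1.2, 0.4]  # roughly agrees with the teacher
far_student = [0.5, 1.0, 4.0]    # prefers a different token

print(distillation_loss(teacher, close_student))
print(distillation_loss(teacher, far_student))
```

The student that ranks tokens the way the teacher does receives the smaller loss, which is the signal a training loop would minimize. In practice this term is usually combined with other objectives, which is consistent with Apple's statement that it ran its own pre-training and post-training.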
This distinction matters because Apple has been positioning Apple Intelligence as a privacy-focused, deeply integrated AI platform built around its own hardware, software, and security standards. At the same time, the company appears to be leaning on major external infrastructure providers to bring more powerful cloud-based AI features to users.
Apple’s cloud AI system is not one single model. The company has clarified that AFM Cloud is split into different versions. The most powerful version, known as Apple Foundation Model Cloud Pro, runs on NVIDIA GPUs inside Google Cloud. Meanwhile, the standard AFM Cloud model and Apple’s cloud-based image model run on Apple Silicon servers as part of Apple’s Private Cloud Compute framework.
This hybrid approach allows Apple to offer more advanced AI capabilities while still keeping privacy as a central selling point. For simpler tasks, Apple Intelligence can rely on local models running directly on the device. For more demanding requests, the system can turn to cloud-based models through Private Cloud Compute.
Apple has also detailed how Private Cloud Compute operates when hosted on Google Cloud. The system uses NVIDIA Confidential Computing with NVIDIA GPUs, Intel CPUs with TDX security technology, and Google’s Titan chip. Apple says this setup is designed to preserve the same privacy-focused protections that it promotes across its own silicon-based infrastructure.
The company says its Google Cloud-based Private Cloud Compute deployment includes transparency guarantees that allow outside security researchers to verify its privacy claims. Apple also says it maintains a cryptographically verifiable, append-only ledger of all Google Cloud hardware used in the Private Cloud Compute fleet. This is intended to reduce the risk of supply chain attacks and give researchers a way to confirm which machines are part of the trusted system.
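The append-only property can be illustrated with a simple hash chain, where each entry's hash covers the previous entry's hash, so rewriting any earlier record invalidates everything after it. This is only a sketch of the general idea: Apple has not described its ledger format here, production transparency logs typically use Merkle trees rather than a flat chain, and the class name and record fields such as `machine_id` are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash that precedes the first entry

def entry_hash(prev_hash, record):
    """Hash a record together with the hash of the entry before it."""
    payload = json.dumps({"prev": prev_hash, "record": record}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

class AppendOnlyLedger:
    def __init__(self):
        self.entries = []  # list of (record, hash) pairs, oldest first

    def head(self):
        """Return the hash of the latest entry, or GENESIS if empty."""
        return self.entries[-1][1] if self.entries else GENESIS

    def append(self, record):
        """Add a record chained to the current head and return its hash."""
        h = entry_hash(self.head(), record)
        self.entries.append((record, h))
        return h

    def verify(self):
        """Recompute the whole chain; any edited record breaks it."""
        prev = GENESIS
        for record, h in self.entries:
            if entry_hash(prev, record) != h:
                return False
            prev = h
        return True

ledger = AppendOnlyLedger()
ledger.append({"machine_id": "node-001", "firmware": "1.0"})  # hypothetical fields
ledger.append({"machine_id": "node-002", "firmware": "1.0"})
print(ledger.verify())
```

A researcher who holds only the latest head hash can detect whether the history of enrolled machines was rewritten, which is the kind of check the article describes.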
Apple’s security architecture also includes several layered protections. Initial network data parsing is handled in a separate process with its own namespace. Shared inference software is recycled frequently with a short time-to-live. Attested keys are stored inside a separate confidential virtual machine that is isolated from outside inputs. These measures are designed to limit exposure, reduce persistence risk, and prevent sensitive user data from being accessed by unauthorized parties.
The company also plans to provide public research tools and offer access to live Private Cloud Compute nodes in research mode through its security bounty program. That move is clearly meant to strengthen confidence in Apple’s privacy promises at a time when cloud AI systems are under growing scrutiny.
On the device side, Apple has also clarified its local AI model strategy. The more advanced on-device model, called AFM Core Advanced, contains 20 billion parameters and was designed entirely by Apple. It uses a sparse mixture-of-experts approach, meaning it does not need to activate the full model for every request. Instead, a router activates only the subset of experts, and therefore parameters, needed to process a specific prompt.
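A toy version of sparse mixture-of-experts routing looks like the sketch below: a router scores every expert, only the top k are run, and their outputs are blended. The expert count, the value of k, and the stand-in expert functions are assumptions chosen for illustration; Apple has not disclosed AFM Core Advanced's routing details.

```python
import math

def top_k_route(router_scores, k=2):
    """Pick the k highest-scoring experts and renormalize their weights."""
    ranked = sorted(range(len(router_scores)),
                    key=lambda i: router_scores[i], reverse=True)
    chosen = ranked[:k]
    peak = max(router_scores[i] for i in chosen)
    exps = {i: math.exp(router_scores[i] - peak) for i in chosen}
    total = sum(exps.values())
    return {i: exps[i] / total for i in chosen}

def moe_forward(x, experts, router_scores, k=2):
    """Run only the selected experts and blend their outputs."""
    weights = top_k_route(router_scores, k)
    return sum(w * experts[i](x) for i, w in weights.items())

# Four stand-in "experts", each just scaling its input (hypothetical).
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]

# Router scores for one token; only the two highest-scoring experts run.
scores = [0.2, 1.5, -0.3, 0.9]
print(top_k_route(scores, k=2))
print(moe_forward(1.0, experts, scores, k=2))
```

With k set to 2 out of 4 experts, half of the expert parameters are never touched for this input, which is why a large total parameter count does not translate directly into per-request compute.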
This design allows Apple to bring a relatively large AI model to mobile hardware without overwhelming the device. However, AFM Core Advanced requires the A19 Pro chip, meaning it will likely be limited to the newest high-end iPhone models.
Apple has also prepared a lighter on-device model for older iPhones and more general AI tasks. This ensures Apple Intelligence can work across a wider range of devices, even if the most advanced local features are reserved for newer hardware.
The way Siri and other Apple Intelligence features handle requests is also important. When a user makes a request, a local orchestrator decides which tools are needed, gathers the relevant information, and creates a structured prompt for the cloud model if cloud processing is required. Apple says raw user data is not sent directly to the cloud. Instead, only the structured prompt is transmitted, helping reduce privacy exposure.
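The flow described above can be sketched as a small local pipeline: choose tools, collect only what they return, and serialize a structured prompt for the cloud. Everything here is a hypothetical stand-in, including the tool names, the keyword-based tool selection, and the `StructuredPrompt` fields; it illustrates the data-minimization idea rather than Apple's actual orchestrator.

```python
import json
from dataclasses import dataclass, field, asdict

@dataclass
class StructuredPrompt:
    task: str
    context: list = field(default_factory=list)

# Hypothetical local tools: each returns only the facts a task needs.
def calendar_tool(device_data):
    return [f"{e['title']} at {e['time']}" for e in device_data["calendar"]]

def contacts_tool(device_data):
    return [c["name"] for c in device_data["contacts"]]

TOOLS = {"calendar": calendar_tool, "contacts": contacts_tool}

def select_tools(request):
    """Toy keyword matcher standing in for the orchestrator's tool choice."""
    text = request.lower()
    names = []
    if "meeting" in text or "schedule" in text:
        names.append("calendar")
    if "contact" in text or "call" in text:
        names.append("contacts")
    return names

def build_prompt(request, device_data):
    """Gather context from the selected tools only."""
    context = []
    for name in select_tools(request):
        context.extend(TOOLS[name](device_data))
    return StructuredPrompt(task=request, context=context)

def serialize_for_cloud(prompt):
    """Only this JSON string would leave the device in this sketch."""
    return json.dumps(asdict(prompt))

device_data = {
    "calendar": [{"title": "Standup", "time": "09:30"}],
    "contacts": [{"name": "Dana", "phone": "555-0100"}],
}
prompt = build_prompt("What meetings do I have today?", device_data)
print(serialize_for_cloud(prompt))
```

Because the request only triggers the calendar tool, the contacts data never appears in the serialized payload, mirroring the claim that the cloud model sees a structured prompt rather than the raw data on the device.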
This architecture shows Apple trying to strike a careful balance. The company wants to deliver competitive AI features that can rival other major platforms, but it also wants to preserve its reputation for privacy and device-level integration. That is why Apple is drawing a firm line between using outside infrastructure and giving external providers direct control over the user-facing AI experience.
Apple has also stressed that it is not simply embedding Google’s Gemini app or Google’s consumer-facing AI models into iOS. The company says it does not use Google client code for Apple Intelligence and does not rely on the same models Google deploys directly to its own customers. Instead, Apple appears to be using licensed technology and infrastructure to support its own model development and deployment.
Still, the latest details make one thing clear: Apple Intelligence is not purely an Apple-only system. It is a layered AI platform that combines Apple-designed on-device models, Apple Silicon cloud servers, Google Cloud infrastructure, NVIDIA GPUs, and a cloud model reportedly distilled from Google’s Gemini.
For users, the practical result could be more powerful Siri responses, better writing tools, improved image generation, smarter app actions, and more capable personalized AI features across Apple devices. For Apple, the challenge will be convincing users and developers that this complex cloud partnership still meets the company’s high privacy standards.
Apple’s AI strategy may have arrived later than many expected, but the company is now moving quickly to explain how the system works. The new clarifications show a company trying to modernize its AI foundation without abandoning the privacy-first identity that has become central to the Apple brand.






