Morgan Stanley: NVIDIA’s Pricey Blackwell AI Chips Still Outshine Cheaper Google and Amazon Rivals

Morgan Stanley: NVIDIA Blackwell AI GPUs Cost More, but Deliver Far Better Compute Efficiency

NVIDIA’s latest AI GPUs may be expensive, but Morgan Stanley believes the higher upfront cost can be justified by stronger compute efficiency and long-term performance advantages.

According to the investment bank, building a 1-gigawatt data center using NVIDIA’s current-generation Blackwell GPUs can cost up to twice as much as building a similar facility with custom AI chips from companies such as Google and Amazon. These custom chips include Google’s Tensor Processing Units, commonly known as TPUs, and Amazon’s Trainium processors.

However, Morgan Stanley argues that raw construction cost does not tell the full story. In artificial intelligence infrastructure, performance per watt is becoming one of the most important measures of value. On that metric, NVIDIA appears to hold a major lead.

The bank estimates that NVIDIA’s AI chips can deliver between 2 times and 8 times better compute performance per watt compared with custom AI ASICs. That advantage is important because AI data centers consume enormous amounts of power, and even small improvements in energy efficiency can translate into major savings at hyperscale.

The debate around NVIDIA’s pricing has become a major topic across the AI industry. NVIDIA CEO Jensen Huang has repeatedly said that while the company’s GPUs carry premium prices, they can generate better returns because they offer stronger performance, greater flexibility, and a more mature software ecosystem.

Morgan Stanley’s analysis compares the performance of several NVIDIA AI platforms against custom silicon from Google and Amazon. The firm measured performance using TFLOPs per watt, which reflects how much compute throughput a chip can deliver for each watt of power it draws.

In the comparison, NVIDIA’s upcoming Vera Rubin GPU using FP4 precision leads the group with a score of 19.5 TFLOPs per watt. The Vera Rubin FP8 version follows with a score of 6.8. NVIDIA’s GB300 based on Blackwell reaches 6.0, while the older H100 based on Hopper scores 3.1.

By comparison, Google’s TPUv7 using FP8 precision scores 4.3 TFLOPs per watt, while Amazon’s Trainium 3 reaches 2.5. Based on these figures, Google’s latest TPU sits between NVIDIA’s Hopper and Blackwell generations, while Amazon’s Trainium 3 falls below the H100 in this specific efficiency comparison.
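
The comparison above comes down to simple ratios. As a rough sketch in Python, the figures quoted in this article can be expressed as multiples of a chosen baseline chip. The dictionary name, its keys, and the function `relative_efficiency` are illustrative only, and the figures mix FP4 and FP8 precision, so the ratios are only as comparable as the underlying numbers.

```python
# TFLOPs per watt as quoted from the Morgan Stanley comparison in this article.
TFLOPS_PER_WATT = {
    "Vera Rubin (FP4)": 19.5,
    "Vera Rubin (FP8)": 6.8,
    "GB300 (Blackwell)": 6.0,
    "TPUv7 (FP8)": 4.3,
    "H100 (Hopper)": 3.1,
    "Trainium 3": 2.5,
}


def relative_efficiency(baseline: str) -> dict[str, float]:
    """Return each chip's efficiency as a multiple of the baseline chip's."""
    base = TFLOPS_PER_WATT[baseline]
    return {name: value / base for name, value in TFLOPS_PER_WATT.items()}


if __name__ == "__main__":
    ratios = relative_efficiency("Trainium 3")
    # Print the chips from most to least efficient relative to the baseline.
    for name, ratio in sorted(ratios.items(), key=lambda kv: -kv[1]):
        print(f"{name:20s} {ratio:.1f}x")
```

With Trainium 3 as the baseline, Vera Rubin at FP4 works out to about 7.8 times and GB300 to about 2.4 times, which is consistent with the 2 times to 8 times range the bank cites.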

The findings suggest that NVIDIA’s Blackwell and future Rubin platforms could remain highly attractive for companies building large-scale AI clusters, even if the initial capital expenditure is higher. For hyperscalers, the decision is not only about buying cheaper chips. It is also about how much AI training and inference capacity they can extract from every megawatt of power.
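
To make the "capacity per megawatt" point concrete, here is a minimal Python sketch that converts a facility power budget and a TFLOPs-per-watt figure into theoretical peak throughput. The function name `peak_exaflops` and the `pue` parameter (power usage effectiveness, the share of facility power lost to cooling and other overhead) are assumptions added for illustration and are not part of Morgan Stanley's analysis. Real clusters deliver well below theoretical peak.

```python
def peak_exaflops(power_mw: float, tflops_per_watt: float, pue: float = 1.0) -> float:
    """Theoretical peak throughput, in exaFLOPS, for a facility power budget.

    power_mw:        total facility power in megawatts.
    tflops_per_watt: chip efficiency figure (for example, from the comparison).
    pue:             hypothetical power usage effectiveness (facility power
                     divided by IT power). 1.0 assumes every watt reaches the
                     accelerators, which is an idealization.
    """
    if pue < 1.0:
        raise ValueError("pue must be >= 1.0")
    chip_watts = power_mw * 1e6 / pue          # watts available to the chips
    flops = chip_watts * tflops_per_watt * 1e12  # TFLOPs -> FLOPs per second
    return flops / 1e18                          # FLOPs -> exaFLOPS


if __name__ == "__main__":
    # 1 gigawatt = 1,000 megawatts, using the efficiency figures quoted above.
    for name, eff in [("GB300", 6.0), ("TPUv7", 4.3), ("Trainium 3", 2.5)]:
        print(f"{name:12s} {peak_exaflops(1000, eff):,.0f} exaFLOPS peak")
```

Under these idealized assumptions, a 1-gigawatt site works out to roughly 6,000 exaFLOPS of peak throughput on GB300 versus roughly 4,300 on TPUv7, so the same power envelope yields noticeably different ceilings.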

However, the market is beginning to look beyond traditional performance-per-watt comparisons. As AI workloads shift heavily toward inference, some infrastructure providers are evaluating chips based on the cost of generating tokens, rather than only the hourly rental cost of a GPU or its theoretical compute output.

One example comes from AI infrastructure company Nebius, where experts have highlighted cost per million tokens as a useful metric for comparing AI accelerators. Estimates cited in the discussion suggest that Groq’s AI chips may generate tokens at roughly 5 to 10 cents per million tokens, while NVIDIA’s Blackwell chips are estimated at around 25 cents per million tokens. Groq’s chips are also said to reach up to 800 tokens per second, compared with roughly 450 tokens per second for Blackwell in the same discussion.
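
The article does not say how these estimates were derived. One common way to arrive at a cost-per-million-tokens figure is to divide an accelerator's hourly cost by the number of tokens it generates in an hour. The Python sketch below shows that arithmetic. The function name and the example inputs are hypothetical placeholders, not figures from Nebius, Groq, or NVIDIA, and the article does not state whether its tokens-per-second numbers are per request or aggregate.

```python
def cost_per_million_tokens(hourly_cost_usd: float, tokens_per_second: float) -> float:
    """Cost in USD to generate one million tokens at a sustained throughput.

    hourly_cost_usd:   what one accelerator (or instance) costs per hour.
    tokens_per_second: sustained aggregate output across all concurrent requests.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    tokens_per_hour = tokens_per_second * 3600
    return hourly_cost_usd / tokens_per_hour * 1_000_000


if __name__ == "__main__":
    # Hypothetical inputs chosen only to illustrate the formula.
    hourly_cost = 3.60     # USD per hour (placeholder)
    throughput = 1000.0    # tokens per second (placeholder)
    print(f"${cost_per_million_tokens(hourly_cost, throughput):.2f} per million tokens")
```

The formula shows why a chip with a higher hourly price can still win on this metric: doubling throughput halves the cost per token at the same hourly cost.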

This shows that the AI chip market is becoming more complex. NVIDIA continues to dominate in broad AI training, high-performance computing, and full-stack infrastructure, but specialized chips are trying to compete in areas where speed, latency, or token-generation cost matter more than general-purpose flexibility.

For now, Morgan Stanley’s view reinforces a key argument in NVIDIA’s favor: the company’s GPUs may demand higher investment upfront, but their efficiency and performance can make them a strong choice for the largest AI data center deployments.

As cloud providers, AI labs, and enterprise customers race to expand computing capacity, the battle between NVIDIA GPUs and custom AI chips from hyperscalers is likely to intensify. Cost will remain a major factor, but energy efficiency, token economics, software support, and real-world workload performance may ultimately decide which platforms win the next phase of AI infrastructure growth.