Deception Unveiled: DeepSeek’s Training Costs Skyrocket Beyond Expectations

The recent buzz over the cost of training DeepSeek’s R1 model sent shockwaves through the markets, driven largely by misleading figures that overstated the efficiency of the company’s operations. Cost estimates now emerging from more detailed analysis paint a very different picture from the one initially perceived.

A report by SemiAnalysis examines the actual expenses faced by DeepSeek and challenges the idea that the company achieved its results with minimal reliance on high-end compute from industry giants like NVIDIA. Initially, the industry was led to believe that DeepSeek’s R1 model required a training budget of only around $5 million while delivering performance comparable to OpenAI’s GPT models, which caused a stir among retail investors and ripples across US stock markets. In light of the new findings, however, that figure appears to grossly understate the real cost.

For background, DeepSeek started as an offshoot of the Chinese hedge fund High-Flyer. According to SemiAnalysis, the project acquired 10,000 units of NVIDIA’s A100 in 2021, a time when export restrictions were less stringent. Once spun off as a separate entity, DeepSeek significantly ramped up its hardware capabilities.

The report notes that DeepSeek currently has around 10,000 of NVIDIA’s H800 AI GPUs, a variant designed for the Chinese market, and a similar number of the higher-end H100 chips. It has also invested in NVIDIA’s H20 AI accelerators. These resources form a pool shared with High-Flyer and support activities ranging from trading and inference to training and research. SemiAnalysis puts DeepSeek’s total capital expenditure at about $1.6 billion, with operating costs of around $944 million, which is hundreds of times the roughly $5 million initially reported.
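As a rough check on the scale of that gap, the quoted figures can be compared directly against the roughly $5 million training cost. The following is a minimal Python sketch that uses only the numbers quoted in this article; the variable and function names are illustrative, and adding capital and operating costs together is an assumption made purely for comparison, not a method taken from the SemiAnalysis report.

```python
# Figures as quoted in this article, in US dollars.
# The SemiAnalysis numbers are estimates, not audited accounts.
REPORTED_TRAINING_COST = 5_000_000   # widely circulated "around $5 million"
CAPEX = 1_600_000_000                # estimated total capital expense
OPEX = 944_000_000                   # estimated operating costs


def multiple_of_reported(amount, baseline=REPORTED_TRAINING_COST):
    """Return how many times larger `amount` is than the reported figure."""
    return amount / baseline


capex_multiple = multiple_of_reported(CAPEX)
opex_multiple = multiple_of_reported(OPEX)
# Assumption: summing capex and opex, purely to show an upper comparison.
combined_multiple = multiple_of_reported(CAPEX + OPEX)

print(f"Capital expense alone: {capex_multiple:.1f}x")
print(f"Operating costs alone: {opex_multiple:.1f}x")
print(f"Capex plus opex:       {combined_multiple:.1f}x")
```

The multiple depends heavily on which figure is set against the $5 million baseline, so any single "factor of N" should be read as an order-of-magnitude statement rather than a precise ratio.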

The original cost figure is now understood to cover only one part of the overall training effort, most likely the compute used for the final model’s training run, rather than the hardware, research, and staffing behind it. Despite the financial misconceptions, DeepSeek has shown real skill in attracting and using local talent, holding recruitment events at prestigious universities and reportedly offering salaries upwards of $1.3 million to key team members. This talent strategy has been pivotal in enabling DeepSeek’s R1 model to emerge as a serious competitor to established players like OpenAI.

Based on the SemiAnalysis findings, the supposed thriftiness of DeepSeek looks more like myth than reality. The full extent of the company’s financial commitments puts last week’s tumultuous market events in context and illustrates the scale of investment required to rival industry leaders in AI.