NVIDIA’s latest GeForce RTX 5090 GPU is making waves with its performance, significantly outpacing AMD’s RX 7900 XTX in inference tasks thanks to its fifth-generation Tensor Cores. This advance signals growing potential for consumers to run sophisticated large language models (LLMs) right on their desktops, further cementing NVIDIA’s position at the forefront of AI.
In AI workloads, particularly when running DeepSeek’s reasoning models, NVIDIA’s newest offerings make the process both accessible and efficient. As demand grows for running high-end language models locally, both NVIDIA and AMD are courting that audience. Following AMD’s showcase of RDNA 3’s capabilities with the DeepSeek R1 model, NVIDIA has answered back: the benchmark scores for its RTX Blackwell GPUs are impressive, demonstrating the GeForce RTX 5090’s superior performance.
The GeForce RTX 5090 holds a significant edge across various DeepSeek R1 models, surpassing not only AMD’s RX 7900 XTX but also NVIDIA’s own previous-generation GPUs. Notably, it can manage up to 200 tokens per second in Distill Qwen 7B and Distill Llama 8B, essentially doubling the throughput of the RX 7900 XTX. These results highlight NVIDIA’s lead in AI performance and suggest that consumer PCs will increasingly feature edge AI capabilities.
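To put those throughput numbers in perspective, a quick sketch converts tokens per second into wall-clock response time. The 200 tokens/s figure comes from the benchmarks above; the 500-token response length and the ~100 tokens/s figure for the RX 7900 XTX are illustrative assumptions (the article only says the 5090 roughly doubles it):

```python
def seconds_for_response(num_tokens: int, tokens_per_second: float) -> float:
    """Time to stream num_tokens at a steady decode rate."""
    return num_tokens / tokens_per_second

# At ~200 tokens/s (RTX 5090 on Distill Qwen 7B), a 500-token reply
# streams in about 2.5 s; at ~100 tokens/s (roughly the RX 7900 XTX
# here), the same reply takes about 5 s.
print(seconds_for_response(500, 200))  # 2.5
print(seconds_for_response(500, 100))  # 5.0
```

In practice decode speed also depends on prompt length, quantization, and batch size, so these are back-of-the-envelope figures rather than guarantees.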
For those looking to harness the power of DeepSeek R1 with NVIDIA’s RTX GPUs, the company has rolled out a dedicated guide that makes the process as straightforward as using any online chatbot. Here’s how you can get started: NVIDIA has made the 671-billion-parameter DeepSeek-R1 model accessible as a microservice preview through its NVIDIA NIM platform on build.nvidia.com. This microservice is capable of delivering up to 3,872 tokens per second on an NVIDIA HGX H200 system, offering developers a secure and efficient platform to build specialized agents.
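Because the NIM microservice exposes a standard, OpenAI-compatible chat-completions interface, calling the hosted preview is a few lines of code. The endpoint URL and the model identifier `deepseek-ai/deepseek-r1` below follow NVIDIA’s usual convention on build.nvidia.com but are assumptions here, so check the model card for the current values; a minimal sketch using only the standard library:

```python
import json
import os
import urllib.request

# Assumed endpoint and model id -- verify against the model card on
# build.nvidia.com before relying on them.
URL = "https://integrate.api.nvidia.com/v1/chat/completions"
MODEL = "deepseek-ai/deepseek-r1"


def build_request(prompt: str, max_tokens: int = 512) -> dict:
    """Assemble a standard OpenAI-style chat-completion payload."""
    return {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": 0.6,
    }


def ask(prompt: str) -> str:
    """Send the prompt; expects an API key in the NVIDIA_API_KEY env var."""
    data = json.dumps(build_request(prompt)).encode()
    req = urllib.request.Request(
        URL,
        data=data,
        headers={
            "Authorization": f"Bearer {os.environ['NVIDIA_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__" and os.environ.get("NVIDIA_API_KEY"):
    print(ask("Why is the sky blue? Think step by step."))
```

Because the interface is OpenAI-compatible, the same payload shape works whether the model is served from NVIDIA’s hosted preview or from a NIM container on your own infrastructure; only the URL changes.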
Soon to be released as part of the NVIDIA AI Enterprise software platform, the application programming interface (API) will allow developers to test and experiment with ease. This NIM microservice integrates smoothly with standard APIs, granting enterprises the ability to run it on their choice of accelerated computing infrastructure while ensuring security and privacy.
By leveraging NVIDIA’s NIM, developers and AI enthusiasts can run this powerful model on their own machines without compromising on performance. Local execution not only strengthens data security but also keeps latency low, provided the hardware requirements are met. NVIDIA’s push for AI innovation continues to create exciting opportunities for consumers and developers alike.