China’s 5nm AI Breakthrough: Local Chipmakers Tape Out Twin Chips for Model Training and AI PCs

China’s AI hardware ecosystem is accelerating again, with reports pointing to a new wave of 5nm GPUs designed to boost both training and on-device AI. According to industry chatter, Anfu Technology has teamed up with Xiangdi to develop a fresh generation of Fuxi GPUs that aim to deliver high parallel-compute performance while cutting reliance on overseas suppliers.

Two chips are currently in the pipeline, each tuned for different segments. One is geared toward rendering and AI PC workloads, and the other targets AI inference and model deployment at the edge with an onboard NPU. The shared headline feature is a 5nm process, which, if realized, would mark a notable step forward for China’s domestic chip ambitions.

Early performance targets are ambitious. The next-gen Fuxi lineup is reportedly aiming for up to 160 TFLOPS of FP32 compute, a figure that underscores the focus on parallel workloads common in AI training and high-performance computing. While 5nm fabrication has been a sticking point for local manufacturers, the report leaves room for interpretation on foundry sourcing. There is speculation about leveraging advanced nodes outside the mainland, but this has not been confirmed. Regardless, the push signals a clear intent: reduce exposure to supply constraints and build a robust, homegrown AI stack.
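As a rough sanity check on the 160 TFLOPS figure, a GPU's theoretical peak FP32 throughput is conventionally estimated as shader cores × clock × 2 FLOPs per cycle (one fused multiply-add per core per cycle). The core count and clock below are hypothetical placeholders, not disclosed Fuxi specifications:

```python
def peak_fp32_tflops(cores: int, clock_ghz: float, flops_per_cycle: int = 2) -> float:
    """Theoretical peak FP32 throughput in TFLOPS.

    flops_per_cycle=2 assumes one fused multiply-add (FMA) per core per cycle,
    the usual convention for quoting GPU peak compute.
    """
    return cores * clock_ghz * flops_per_cycle / 1000.0

# Hypothetical configuration -- NOT a confirmed Fuxi spec.
# Hitting 160 TFLOPS would require, for example, 40,960 FP32
# cores running at roughly 1.95 GHz:
print(peak_fp32_tflops(40960, 1.953125))  # ~160.0
```

Real-world sustained throughput is typically well below this theoretical peak, which is why the memory and software questions raised later matter as much as the headline number.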

Here’s how the two chips are positioned based on what’s known so far:
– Fuxi A0: A rendering-first GPU designed for AI PC scenarios and graphics-heavy tasks.
– Fuxi B0: An AI-centric processor featuring an integrated NPU for efficient inference and on-device deployment. It’s expected to support mainstream AI models such as DeepSeek R1, pointing to practical, ready-to-run capabilities rather than just lab demos.
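For on-device deployment of the kind the B0 targets, a quick first-order feasibility check is weight memory = parameters × bytes per parameter. The parameter count below is illustrative (a hypothetical 7B-parameter distilled model, not a confirmed B0 workload), and a real deployment also needs room for the KV cache and runtime overhead:

```python
def model_weight_gb(params_billions: float, bytes_per_param: float) -> float:
    """First-order estimate of model weight memory in GB (weights only).

    bytes_per_param: 2.0 for FP16/BF16, 1.0 for INT8, 0.5 for 4-bit quantization.
    """
    return params_billions * bytes_per_param

# Illustrative 7B-parameter model -- NOT a confirmed B0 target workload.
print(model_weight_gb(7, 2.0))  # FP16 weights: 14.0 GB
print(model_weight_gb(7, 0.5))  # 4-bit quantized weights: 3.5 GB
```

The gap between those two numbers is why quantization support in the NPU's software stack is often the deciding factor for edge-class hardware, not raw compute alone.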

The broader context is equally important. Beijing’s industrial strategy has prioritized self-sufficiency in critical computing technologies, particularly in light of limited access to certain high-end accelerators. Domestic champions in AI hardware have been encouraged to fill the gap, and this Anfu–Xiangdi effort fits squarely within that narrative. If the 5nm claim and FP32 targets bear out, it would represent a significant leap in attainable performance for local AI developers across training, inference, and mixed workloads.

That said, several questions remain open. Timelines for tape-out and volume production have not been detailed. Memory bandwidth, interconnects, software stacks, and ecosystem support will all be decisive in determining how these chips perform in real-world deployments. Compatibility with popular frameworks, optimized kernels, and driver maturity will matter just as much as raw TFLOPS when it comes to training stability and inference efficiency.

What this means for the market:
– A stronger domestic alternative: Local players in cloud, enterprise, and consumer AI PCs could benefit from reduced procurement risk and better alignment with domestic supply chains.
– Competitive pressure: Even without full specs, performance targets like 160 TFLOPS FP32 suggest a drive to compete in segments that demand heavy parallelism, from LLM fine-tuning to advanced rendering.
– Edge and on-device AI: The NPU-equipped B0 is positioned for model deployment outside the data center, supporting a growing trend toward on-device intelligence and privacy-first workflows.

In short, Anfu Technology and Xiangdi’s next-gen Fuxi GPUs hint at a meaningful step toward higher-performance, locally supported AI hardware built on advanced process technology. While details are still emerging, the direction is clear: more compute, tighter integration for AI workloads, and a push to anchor China’s AI progress on a largely self-reliant technology stack. Keep an eye on official specifications, foundry disclosures, and software ecosystem announcements in the coming months to see how these chips stack up in practice.