Huawei is sharpening its edge in the domestic AI race, unveiling a multi-year roadmap that puts homegrown technology front and center. The company outlined an ambitious plan to expand its Ascend lineup through 2028, doubling down on an internal tech stack and in-house high-bandwidth memory to compete head-on with leading AI accelerators.
At its recent industry event, Huawei detailed successors to the Ascend 910C that target both inference and training workloads, with clear performance and memory upgrades rolling out each year. The strategy revolves around two pillars: pushing low-precision compute for efficiency and scaling bandwidth and capacity with self-developed HBM.
Ascend 950PR: the next step in inference performance
– First Ascend chip to integrate Huawei’s self-built HBM
– Supports low-precision formats up to FP8
– Compute: 1 PFLOPS FP8, 2 PFLOPS FP4
– Interconnect bandwidth: 2 TB/s
– HBM (HiBL 1.0): 128 GB capacity, 1.6 TB/s bandwidth
– Target workloads: inference-centric tasks such as prefill and recommendation
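The published 950PR numbers allow a quick back-of-the-envelope roofline check. The sketch below derives the chip's compute-to-bandwidth ratio from the figures in the list above; the resulting ops-per-byte threshold is our own arithmetic, not a number Huawei has published.

```python
# Back-of-the-envelope check of the Ascend 950PR figures above.
# Inputs are the published specs; the ops-per-byte ratio is derived.

fp8_ops_per_s = 1e15      # 1 PFLOPS at FP8
hbm_bytes_per_s = 1.6e12  # HiBL 1.0 bandwidth: 1.6 TB/s

# Roofline-style arithmetic intensity: how many FP8 operations the chip
# can perform per byte it pulls from HBM. Kernels below this ratio are
# bandwidth-bound, which fits the inference/prefill positioning.
ops_per_byte = fp8_ops_per_s / hbm_bytes_per_s
print(f"{ops_per_byte:.0f} FP8 ops per HBM byte")  # → 625 FP8 ops per HBM byte
```

A ratio of 625 ops per byte is high, which is one way to read the inference focus: memory-bound stages like decode benefit more from the bandwidth side of the design than from raw FLOPS.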
Huawei is also preparing its second-generation HBM, aiming to boost both capacity and throughput:
– HiZQ 2.0 HBM: 144 GB capacity, 4 TB/s bandwidth
Ascend 950DT: training-focused follow-up
– Planned for Q4 2026
– Training-oriented design
– Uses HiZQ 2.0 HBM for higher memory capacity and bandwidth than the 950PR
Ascend 960: bandwidth and compute scale up
– Planned for Q4 2027
– Interconnect bandwidth: 2.2 TB/s
– Memory: 288 GB (likely HiZQ 2.0)
– Memory bandwidth: 9.6 TB/s
– Compute: 2 PFLOPS FP8, 4 PFLOPS FP4
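Comparing these figures with the 950PR specs earlier in the article shows an uneven scaling pattern. The sketch below simply divides the published numbers; the ratios themselves are derived, not announced.

```python
# Generation-over-generation scaling implied by the published numbers:
# Ascend 960 (above) vs. Ascend 950PR (earlier in the article).

specs_950pr = {"fp8_pflops": 1, "memory_gb": 128, "mem_bw_tbs": 1.6}
specs_960   = {"fp8_pflops": 2, "memory_gb": 288, "mem_bw_tbs": 9.6}

for key in specs_950pr:
    ratio = specs_960[key] / specs_950pr[key]
    print(f"{key}: {ratio:.2f}x")
# fp8_pflops scales 2x, memory 2.25x, and memory bandwidth 6x,
# suggesting the 960 generation leans hardest on the HBM upgrade.
```

The outsized bandwidth jump (6x vs. 2x compute) is consistent with the roadmap's stated emphasis on scaling memory throughput via self-developed HBM.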
Ascend 970: the 2028 leap
– Set for 2028
– Positioned to deliver significant jumps in both compute and memory, extending the platform's capabilities for large-scale AI; detailed specifications have not yet been disclosed
Why this roadmap matters
– Homegrown stack: By adopting self-built HBM and internal designs, Huawei reduces reliance on external components while optimizing for its AI software ecosystem.
– Precision for efficiency: Emphasis on FP8 and FP4 aligns with modern AI trends, increasing throughput for both inference and training at lower power and cost.
– Balanced portfolio: The 950PR zeroes in on inference-heavy scenarios, while the 950DT and 960 raise the ceiling for training, helping data centers scale across use cases.
– Domestic demand: With an extended pipeline through 2028, Huawei is signaling it can supply sustained AI compute growth for China’s expanding AI industry.
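The efficiency argument behind the FP8/FP4 emphasis comes down to bytes per value. The sketch below illustrates it with a hypothetical 1-billion-parameter model (the model size is an assumption for illustration, not from the article).

```python
# Memory footprint of model weights at the precisions the article
# mentions. The 1B-parameter model size is hypothetical.

bytes_per_value = {"FP32": 4, "FP16": 2, "FP8": 1, "FP4": 0.5}
n_params = 1_000_000_000  # hypothetical 1B-parameter model

for fmt, nbytes in bytes_per_value.items():
    gb = n_params * nbytes / 1e9
    print(f"{fmt}: {gb:.1f} GB")
# FP32: 4.0 GB, FP16: 2.0 GB, FP8: 1.0 GB, FP4: 0.5 GB
```

Halving the bytes per value doubles how many weights fit in the same HBM and halves the bandwidth needed to stream them, which is why the low-precision push pairs naturally with the memory-capacity roadmap.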
In short, Huawei’s AI chip roadmap marks a clear, iterative march toward higher bandwidth, larger memory pools, and faster low-precision performance. From the 950PR’s in-house HBM debut to the 970’s anticipated leap, the Ascend series is being positioned to power everything from recommendation engines to large-scale model training across the next several years.