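A presenter on stage showcases a microchip alongside the text '基于花港架构 AI 推一体 超智融合' (roughly, "Based on the Huagang architecture, integrated AI inference, supercomputing-intelligence fusion") at the MUSA 2025 event.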

Next-Gen Leap: 15x Faster Gaming, 50x Better Ray Tracing, DX12 Ultimate—Arriving Next Year

Chinese GPU maker Moore Threads has revealed two next-generation graphics processors built on its new Flower Harbor (Huagang) architecture: Lushan, aimed at gaming and content creation, and Huashan, designed for large-scale AI workloads. Announced during the company’s MUSA 2025 summit, the new platform is positioned as a major step forward for Moore Threads, with the company promising significant gains in gaming performance, ray tracing, AI compute, and memory capacity.

At the heart of both GPUs is the Flower Harbor architecture, which Moore Threads says uses a redesigned compute unit that raises compute density by 50% and improves energy efficiency by 10%. The architecture supports a wide range of compute formats, from FP4 all the way up to FP64, and introduces Moore Threads’ own mixed low-precision formats (MTFP8, MTFP6, and MTFP4) to better target modern AI and hybrid workloads.

Supported compute formats include FP64, FP32, TF32, FP16, BF16, FP8, FP6, FP4, INT8, plus MTFP8, MTFP6, and MTFP4.
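
To give a sense of why the lower-precision formats matter, the sketch below estimates how much memory a model's weights would occupy in each of the standard formats listed above. It is an illustration only, not anything Moore Threads has published: the bit widths are the nominal ones implied by the format names, TF32 is assumed to be stored in a 32-bit container (as on other vendors' hardware), the MTFP formats are left out because their layouts have not been described, and the 7-billion-parameter model is a hypothetical example.

```python
# Nominal storage width in bits per value for the standard formats listed above.
# Assumption: TF32 values occupy a 32-bit container in memory.
# The MTFP8/MTFP6/MTFP4 formats are omitted because their layouts are not public here.
BITS_PER_VALUE = {
    "FP64": 64,
    "FP32": 32,
    "TF32": 32,
    "FP16": 16,
    "BF16": 16,
    "FP8": 8,
    "INT8": 8,
    "FP6": 6,
    "FP4": 4,
}


def weight_footprint_gib(num_params: int, fmt: str) -> float:
    """Return the raw size in GiB of num_params values stored in format fmt.

    Ignores scaling factors, padding, and any other per-block metadata.
    """
    bits = BITS_PER_VALUE[fmt]
    return num_params * bits / 8 / 2**30


if __name__ == "__main__":
    hypothetical_params = 7_000_000_000  # hypothetical model size, for illustration
    for fmt in BITS_PER_VALUE:
        size = weight_footprint_gib(hypothetical_params, fmt)
        print(f"{fmt:>5}: {size:8.2f} GiB")
```

Halving the bit width halves the raw footprint, which is the basic reason 8-bit, 6-bit, and 4-bit formats are attractive for large AI models.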

For AI and data center use, Moore Threads is also emphasizing software and scaling. The architecture includes an asynchronous programming model and an ultra-large-scale interconnect strategy intended for massive clusters. Using its MTLink high-speed interconnect technology, Moore Threads says it can scale beyond 100,000 GPUs within a single cluster, a clear signal that Huashan is being built with large training and inference deployments in mind.

Lushan is positioned as the company’s upcoming gaming GPU family and the successor to its current MTT consumer lineup. While Moore Threads isn’t sharing full product specifications yet, it has outlined the performance targets it expects from Lushan-based graphics cards, which are intended to replace older MTT S80 and MTT S90 models.

According to Moore Threads’ own projections, Lushan could deliver:
15x higher AAA gaming performance
50x higher ray tracing performance
64x higher AI compute performance
16x higher geometry processing performance
4x higher texture fill rate
8x higher atomic memory access performance
4x higher memory capacity

Beyond raw performance, one of the biggest upgrades Moore Threads is highlighting is modern API support. The company says the new architecture is fully compatible with current graphics APIs, including DirectX 12 Ultimate, addressing a key weakness in earlier consumer products. It’s also promoting an AI generative rendering approach within its UniTE unified rendering architecture, alongside a new hardware ray-tracing engine meant to open the door to advanced techniques such as neural rendering and path tracing.

On the AI side, Huashan appears to be a chiplet-based design and is described as featuring two chiplets with eight HBM sites. Moore Threads compares Huashan’s capabilities with NVIDIA’s Hopper and Blackwell-class GPUs, claiming floating-point compute that’s close to Blackwell B200, bandwidth that matches B200 levels, and memory access capacity that it says can exceed the Blackwell chip in certain scenarios.

Memory is another major theme across both product lines. Moore Threads says the new GPUs will bring 4x the memory capacity compared to its previous generation. Since the MTT S80/S90 shipped with 16GB of GDDR6, that points to potential configurations reaching up to 64GB in upcoming consumer solutions, depending on final board designs.
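
The arithmetic behind that 64GB figure can be sketched as follows. The 16GB baseline and the 4x multiplier come from the figures above; the function name is purely illustrative, and the result is an upper bound implied by the claim, not a confirmed product specification.

```python
PREVIOUS_GEN_MEMORY_GB = 16  # MTT S80/S90 capacity cited above
CAPACITY_MULTIPLIER = 4      # Moore Threads' claimed generational increase


def projected_capacity_gb(baseline_gb: int, multiplier: int) -> int:
    """Scale a baseline memory capacity by a claimed generational multiplier."""
    return baseline_gb * multiplier


if __name__ == "__main__":
    projected = projected_capacity_gb(PREVIOUS_GEN_MEMORY_GB, CAPACITY_MULTIPLIER)
    print(f"Projected maximum capacity: {projected}GB")
```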

As for availability, Moore Threads is targeting 2026 for the first graphics cards in the Lushan gaming lineup, with Huashan AI products expected around a similar timeframe.