Alibaba’s Zhenwu M890 Takes Aim at NVIDIA Hopper With 3x H20 Performance and 144GB HBM3

Alibaba Unveils Zhenwu M890 AI Chip and Qwen3.7-Max Model for Agentic AI Workloads

Alibaba is stepping deeper into the global AI race with the launch of its new Zhenwu M890 AI chip and Qwen3.7-Max large language model, two technologies built to support the next wave of agentic AI systems. The announcement highlights Alibaba Cloud’s growing focus on AI inference, autonomous agents, advanced coding assistants, and large-scale enterprise AI deployments.

The Zhenwu M890 is based on Alibaba’s self-developed PPU, or Parallel Processing Unit, architecture. It also includes a dedicated Transformer core engine, making it specifically tuned for modern AI workloads where large language models, multi-agent systems, and real-time inference performance are critical.

Alibaba says the Zhenwu M890 delivers 0.6 PFLOPS (600 TFLOPS) of FP16 compute. That places it in a performance class comparable to NVIDIA’s A100 for half-precision workloads, while the company claims it is three times faster than NVIDIA’s Hopper-based H20 in targeted scenarios. Compared with Alibaba’s previous-generation Zhenwu chips, the M890 is said to provide around a 3x jump in compute performance.

Memory is another major upgrade. The Zhenwu M890 comes with 144 GB of HBM3 memory, a 50% increase over the earlier Zhenwu 810E, which featured 96 GB. Interconnect bandwidth has also improved, rising from 700 GB/s on the 810E to 800 GB/s on the M890.

The chip supports FP32, FP16, FP8, and FP4 data formats, giving it flexibility for different AI training and inference needs. Support for lower-precision formats such as FP8 and FP4 is especially important for running large AI models more efficiently, reducing memory pressure, and improving throughput in inference-heavy environments.
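To make the memory-pressure point concrete, here is a rough back-of-the-envelope sketch of how much space a model's weights alone would occupy in each supported format, compared against the M890's 144 GB. The 70-billion-parameter model size is a hypothetical chosen for illustration, not a figure from Alibaba, and the estimate ignores KV cache, activations, and runtime overhead, so real headroom would be smaller.

```python
# Bytes needed to store one parameter in each format the article lists.
BYTES_PER_PARAM = {"FP32": 4.0, "FP16": 2.0, "FP8": 1.0, "FP4": 0.5}

# Zhenwu M890 memory capacity as reported in the article (treated as decimal GB).
HBM_CAPACITY_GB = 144


def weight_footprint_gb(num_params: float, fmt: str) -> float:
    """Size of the model weights alone, in decimal gigabytes."""
    return num_params * BYTES_PER_PARAM[fmt] / 1e9


def fits_on_one_chip(num_params: float, fmt: str) -> bool:
    """True if the weights alone fit in a single chip's memory."""
    return weight_footprint_gb(num_params, fmt) <= HBM_CAPACITY_GB


if __name__ == "__main__":
    params = 70e9  # hypothetical model size, for illustration only
    for fmt in BYTES_PER_PARAM:
        size = weight_footprint_gb(params, fmt)
        verdict = "fits" if fits_on_one_chip(params, fmt) else "does not fit"
        print(f"{fmt}: {size:.0f} GB of weights, {verdict} in {HBM_CAPACITY_GB} GB")
```

Under these assumptions, each halving of precision halves the weight footprint, which is why FP8 and FP4 leave far more memory free for batching and long contexts.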

Alibaba is not positioning the Zhenwu M890 only as a standalone chip; the company is building a broader AI infrastructure ecosystem around it. A key part of this strategy is ICN Switch 1.0, a new interconnect chip designed to deliver 25.6 Tb/s of bandwidth with point-to-point latency of less than 150 nanoseconds. This high-speed interconnect is intended to help data centers support massive agent concurrency, where many AI agents operate simultaneously across large clusters.

The new AI infrastructure stack also includes Alibaba’s Yitian Arm-based host CPU and Panmai networking cards. These components are designed to work together inside the Panjiu AL128 Supernode Server from Alibaba Cloud.

The Panjiu AL128 Supernode Server integrates 128 AI accelerators inside a single rack. Alibaba says this design can deliver bandwidth at the petabyte-per-second scale, making it suitable for large AI model inference, enterprise AI services, and high-concurrency agentic AI platforms.
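For a sense of scale, the per-chip figures quoted earlier can be multiplied out across a full rack. This is only a naive sketch: it assumes all 128 accelerators are Zhenwu M890 parts and simply sums peak per-chip numbers, which Alibaba has not confirmed as rack-level specifications. Real sustained throughput would be lower.

```python
# Figures taken from the article; the rack composition is an assumption.
ACCELERATORS_PER_RACK = 128      # Panjiu AL128 Supernode Server
HBM_PER_CHIP_GB = 144            # Zhenwu M890 memory
FP16_PER_CHIP_PFLOPS = 0.6       # Zhenwu M890 claimed FP16 peak


def rack_totals() -> dict:
    """Naive aggregate of per-chip peak figures across one rack."""
    total_hbm_gb = ACCELERATORS_PER_RACK * HBM_PER_CHIP_GB
    return {
        "hbm_gb": total_hbm_gb,
        "hbm_tb": total_hbm_gb / 1000,  # decimal terabytes
        "fp16_pflops": ACCELERATORS_PER_RACK * FP16_PER_CHIP_PFLOPS,
    }


if __name__ == "__main__":
    totals = rack_totals()
    print(f"Aggregate HBM:  {totals['hbm_gb']} GB ({totals['hbm_tb']:.1f} TB)")
    print(f"Aggregate FP16: {totals['fp16_pflops']:.1f} PFLOPS (peak, summed)")
```

On these assumptions a single rack would hold roughly 18.4 TB of accelerator memory and about 76.8 PFLOPS of peak FP16 compute.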

Alibaba’s chip division has reportedly shipped around 560,000 Zhenwu AI chips so far. These chips are being used by more than 400 external customers across over 20 industries, showing that Alibaba is already building a commercial footprint for its AI hardware.

The company has also shared a roadmap for future Zhenwu AI accelerators. The Zhenwu 810E served as the first-generation baseline chip, offering training and inference support, 96 GB of memory, and 700 GB/s interconnect bandwidth.

The Zhenwu M890 is the next major step, bringing a fully upgraded parallel computing architecture, 144 GB of memory, 800 GB/s of interconnect bandwidth, and an overall performance boost of around 3x.

Alibaba then plans to introduce the Zhenwu V900, which is expected to arrive with a deeper iteration of the company’s parallel computing architecture. The V900 is planned to deliver another 3x performance improvement, 216 GB of memory, and 1,200 GB/s of interconnect bandwidth.
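The generation-over-generation gains implied by these roadmap numbers can be worked out with a short sketch. The memory and bandwidth values come from the article. The compounded performance figure is plain arithmetic on Alibaba's approximate "3x" claims, assuming the M890's 3x is measured against the 810E, and should not be read as a measured result.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chip:
    name: str
    memory_gb: int
    interconnect_gb_s: int
    perf_step: float  # claimed speedup over the previous generation (1.0 = baseline)


# Roadmap figures as reported in the article.
ROADMAP = [
    Chip("Zhenwu 810E", 96, 700, 1.0),
    Chip("Zhenwu M890", 144, 800, 3.0),
    Chip("Zhenwu V900", 216, 1200, 3.0),
]


def generational_gains(roadmap: list) -> list:
    """Ratio of each chip to its predecessor, plus compounded claimed performance."""
    rows = []
    cumulative = 1.0
    previous = roadmap[0]
    for chip in roadmap:
        cumulative *= chip.perf_step
        rows.append({
            "name": chip.name,
            "memory_x": chip.memory_gb / previous.memory_gb,
            "interconnect_x": chip.interconnect_gb_s / previous.interconnect_gb_s,
            "cumulative_perf_x": cumulative,
        })
        previous = chip
    return rows


if __name__ == "__main__":
    for row in generational_gains(ROADMAP):
        print(f"{row['name']}: memory x{row['memory_x']:.2f}, "
              f"interconnect x{row['interconnect_x']:.2f}, "
              f"claimed perf vs 810E x{row['cumulative_perf_x']:.0f}")
```

Read this way, memory grows 1.5x at each step, while interconnect bandwidth rises only about 1.14x from the 810E to the M890 before a planned 1.5x jump on the V900.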

Further ahead, the Zhenwu J900 is expected to bring a more significant architectural breakthrough. Alibaba is targeting continued performance gains as it works toward competing at the highest level of the international AI chip market.

Alongside its new AI hardware, Alibaba Cloud is introducing Qwen3.7-Max, a new large language model built for agentic AI, coding, reasoning, and long-duration autonomous task execution.

Qwen3.7-Max is designed to act as a powerful coding assistant, supporting everything from rapid frontend prototyping to complex software engineering across multiple files. This makes it relevant for developers, enterprise software teams, and AI-powered development platforms.

The model is also optimized for office productivity and business automation. Alibaba says Qwen3.7-Max can coordinate multi-agent workflows, allowing several specialized AI agents to work together on more complex tasks.

One of its most notable capabilities is long-horizon execution. According to Alibaba Cloud, Qwen3.7-Max can run autonomous agentic tasks continuously for up to 35 hours and manage more than 1,000 tool calls without a drop in performance. This is an important feature for advanced AI agents that need to plan, execute, check results, and continue operating across extended workflows.

The model has been optimized for several leading agent frameworks, including OpenClaw, Hermes Agent, Claude Code, Qwen Paw, and Qoder. This makes it easier for developers to use Qwen3.7-Max as the foundation for different AI agent systems.

Alibaba says Qwen3.7-Max performs strongly across major benchmarks covering coding, general-purpose agents, multilingual capabilities, and broader reasoning tasks. Benchmark data shared by the company shows competitive results in areas such as terminal-based coding, software engineering verification, multilingual software tasks, repository-level coding, scientific coding, and web development.

The launch of Qwen3.7-Max reflects a broader shift in the AI industry. Companies are moving beyond simple chatbot interactions toward agentic AI systems that can use tools, make decisions, execute multi-step workflows, and operate with greater independence. These systems require both powerful AI models and efficient infrastructure, which is why Alibaba is pairing its new model with its expanding Zhenwu chip ecosystem.

Qwen3.7-Max is expected to become available soon through Alibaba’s model service platform for developers and enterprise customers worldwide.

With the Zhenwu M890 AI chip and Qwen3.7-Max model, Alibaba is making a clear push into the future of agentic AI. The company is combining custom AI silicon, high-speed interconnects, rack-scale supernode systems, and advanced large language models into one integrated strategy.

As demand for AI inference, autonomous agents, and enterprise AI automation continues to grow, Alibaba’s latest hardware and software launches could help strengthen its position in the fast-moving artificial intelligence market.
