[Image: AMD Instinct MI350P graphics card against a dark, abstract background]

AMD Unveils MI350P: First PCIe Instinct Accelerator in Four Years

AMD has introduced the Instinct MI350P PCIe, a new GPU accelerator built specifically for enterprise AI workloads and the company’s first PCIe-based Instinct design in four years. The big idea behind MI350P is straightforward: give data centers a powerful, modern AI accelerator that can be added to existing servers without requiring a full platform overhaul.

A drop-in PCIe AI accelerator for real-world data centers

The Instinct MI350P PCIe is designed as a dual-slot, server-focused card intended for standard air-cooled systems. That matters for enterprises that want to expand on-premises inference capacity while staying within current rack space, power delivery, and cooling limits. Instead of pushing customers toward specialized infrastructure, MI350P aims to scale AI compute through a familiar PCIe deployment model.

AMD positions the MI350P as a practical option for organizations preparing for the “agentic AI” era, where inference workloads and autonomous AI agents can drive higher demand across internal systems. With MI350P, the target is faster deployment and lower friction for upgrading AI capability inside an existing data center.

Key MI350P highlights aimed at AI throughput

AMD is emphasizing performance in lower-precision formats commonly used in modern AI inference and training workflows. The MI350P includes native support for MXFP6 and MXFP4, along with sparsity acceleration for widely used 8-bit and 16-bit formats. AMD also highlights an open ecosystem with low- and no-cost development stack options intended to simplify deployment and reduce operating expenses.

On the performance side, AMD estimates 2,299 TFLOPS with up to 4,600 peak TFLOPS at MXFP4, positioning it as extremely high throughput for an enterprise PCIe accelerator. Memory is another major focus: the MI350P is rated for 144GB of HBM3E and up to 4TB/s of memory bandwidth.

Architecture and specs: CDNA 4, 3nm compute chiplets, and 144GB HBM3E

The Instinct MI350P is based on AMD’s CDNA 4 architecture. It uses a chiplet design with four accelerator compute dies (XCDs) built on TSMC’s 3nm process, half the count used in the MI350X, plus a single I/O die fabricated on TSMC’s 6nm FinFET process.

Compute resources include 128 compute units, totaling 8,192 stream processors and 512 matrix cores. Peak clock is listed at 2200 MHz, and AMD cites 73 billion transistors for the chip.
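A quick back-of-envelope check shows how the quoted compute figures follow from these specs. The factor of two for packed/dual-issue FP32 is an assumption consistent with recent CDNA designs, not a figure stated by AMD:

```python
# Sanity check of the FP32 throughput figure from the listed specs.
# Assumptions (not from the article): 2 FLOPs per FMA, and a packed/
# dual-issue FP32 path that doubles vector throughput per clock.
stream_processors = 8192        # 128 CUs x 64 stream processors
peak_clock_hz = 2.2e9           # 2200 MHz peak clock
flops_per_sp_per_clock = 2 * 2  # FMA (2 FLOPs) x packed FP32 (x2)

peak_fp32_tflops = stream_processors * peak_clock_hz * flops_per_sp_per_clock / 1e12
print(f"{peak_fp32_tflops:.1f} TFLOPS")  # ≈ 72.1, in line with AMD's 72 TFLOPs FP32 figure
```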

For cache and memory, MI350P includes 128MB of last-level cache (Infinity Cache) and 144GB of HBM3E across a 4096-bit memory interface, delivering up to 4TB/s of bandwidth. By comparison, MI350X is noted as carrying 288GB HBM3E on an 8192-bit interface.
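The bandwidth figure is likewise consistent with the interface width. The ~8 Gb/s per-pin rate below is inferred from the quoted numbers, not stated in AMD's materials:

```python
# Sanity check: a 4096-bit HBM3E interface at an assumed ~8 Gb/s per pin
# works out to roughly the quoted 4 TB/s of memory bandwidth.
bus_width_bits = 4096
pin_rate_gbps = 8.0  # assumed HBM3E per-pin data rate

# bits -> bytes (/8), then GB/s -> TB/s (/1000)
bandwidth_tbs = bus_width_bits * pin_rate_gbps / 8 / 1000
print(f"{bandwidth_tbs:.2f} TB/s")  # ≈ 4.10 TB/s
```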

Power, size, and server deployment design

The PCIe card is 10.5 inches (267mm) long and uses a passively cooled design intended for server airflow. Power is delivered via a 16-pin connector, with a 600W total board power rating. AMD also notes it can be configured down to 450W, offering some flexibility for data centers managing power envelopes.

MI350P performance figures across common AI precisions

AMD lists the following performance targets for the Instinct MI350P PCIe:

4.6 PFLOPs MXFP4
4.6 PFLOPs MXFP6
2.3 PFLOPs MXFP8
1.15 PFLOPs FP16 (2.3 PFLOPs with sparsity)
1.15 PFLOPs BFloat16 (2.3 PFLOPs with sparsity)
2.3 POPs INT8 (4.6 POPs with sparsity)
72 TFLOPs FP32
36 TFLOPs FP64
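Two derived figures help put the list above in context: the compute-to-bandwidth ratio at MXFP4, and peak MXFP4 throughput per watt at the 600W board rating. Both are illustrative calculations from the quoted numbers, not figures AMD publishes:

```python
# Derived figures from the quoted specs (illustrative, not AMD-quoted).
peak_mxfp4_flops = 4.6e15   # 4.6 PFLOPs MXFP4
mem_bw_bytes = 4.0e12       # 4 TB/s memory bandwidth
tbp_watts = 600             # total board power

flops_per_byte = peak_mxfp4_flops / mem_bw_bytes      # arithmetic intensity at peak
tflops_per_watt = peak_mxfp4_flops / 1e12 / tbp_watts # efficiency at peak MXFP4
print(f"{flops_per_byte:.0f} FLOPs/byte, {tflops_per_watt:.2f} TFLOPS/W")
# → 1150 FLOPs/byte, 7.67 TFLOPS/W
```

The high FLOPs-per-byte ratio underscores why low-precision formats like MXFP4 matter: only workloads with substantial data reuse can keep the compute units fed from 4 TB/s of memory bandwidth.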

These numbers reinforce AMD’s message that the MI350 family, including MI350P, is built to accelerate multiple enterprise AI precision formats—especially MXFP6 and MXFP4—where throughput can translate directly into better inference density and performance per server.

Competitive landscape: aimed at PCIe AI accelerator buyers

MI350P is positioned against NVIDIA’s H200 NVL, a PCIe-based accelerator built on the Hopper H200 GPU with 141GB of HBM3E memory and a street price of roughly $30,000 to $40,000 USD. NVIDIA’s newer RTX PRO 6000 Blackwell Server Edition also competes in this space, but it uses a standard chip rather than the company’s dedicated server silicon and carries 96GB of GDDR7 memory.

Availability and software stack

AMD says Instinct MI350P PCIe GPUs are available through various partners. The card is presented as part of an open ecosystem and includes an enterprise-ready AI software stack with ROCm support, targeting organizations that want a more flexible approach to AI deployment while still demanding data center-grade performance and stability.