CXL Unleashed: From Lab Breakthroughs to Mainstream Hardware Adoption

CXL hits the reset button—and it’s gearing up for mainstream adoption. As generative AI explodes and hyperscale data centers race to keep up, the way components connect and share data is becoming just as important as raw compute. By 2028, Compute Express Link (CXL) is expected to be present in roughly 90% of new server designs, signaling a major shift toward open, flexible interconnects that can match the pace of high-performance computing.

The urgency is clear. GenAI training and inference workloads devour memory bandwidth and demand ultra-low latency at massive scale. Traditional architectures strain under these conditions, pushing the market toward interconnect technologies that deliver faster data movement and higher throughput. That’s exactly where CXL comes in. An open industry standard that runs over the PCIe physical layer, CXL combines three sub-protocols (CXL.io, CXL.cache, and CXL.mem) to enable memory expansion, memory pooling across hosts, and cache-coherent communication between heterogeneous devices. In practice, that means CPUs, GPUs, and accelerators can access and manage memory more efficiently, unlocking new levels of utilization and scalability for AI and HPC environments.
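To system software, this is less exotic than it sounds: on current Linux kernels, a CXL Type 3 memory expander that has been onlined as system RAM typically shows up as a CPU-less NUMA node, so existing NUMA placement APIs work unchanged. The sketch below is an illustration rather than production code; the node ID (`CXL_NODE = 2`) is a placeholder assumption, and it simply uses libnuma to inspect node capacities and place a large, bandwidth-tolerant buffer on the CXL-backed node.

```c
/* Minimal sketch: treating CXL-attached memory as a far NUMA node via libnuma.
 * Assumes a Linux host where the CXL expander has been onlined as system RAM;
 * CXL_NODE is a hypothetical node ID for illustration only.
 * Build with: gcc cxl_alloc.c -lnuma
 */
#include <numa.h>
#include <stdio.h>
#include <string.h>

#define CXL_NODE 2  /* hypothetical NUMA node backed by a CXL memory expander */

int main(void) {
    if (numa_available() < 0) {
        fprintf(stderr, "NUMA support is not available on this system\n");
        return 1;
    }

    /* Report how much memory each node exposes; a CPU-less node with large
     * capacity is typically the CXL-attached tier. */
    for (int node = 0; node <= numa_max_node(); node++) {
        long long free_bytes = 0;
        long long total = numa_node_size64(node, &free_bytes);
        if (total > 0)
            printf("node %d: %lld MiB total, %lld MiB free\n",
                   node, total >> 20, free_bytes >> 20);
    }

    /* Place a large, bandwidth-tolerant buffer on the CXL node while the
     * latency-critical working set stays in local DRAM. */
    size_t len = 1UL << 30; /* 1 GiB */
    void *buf = numa_alloc_onnode(len, CXL_NODE);
    if (!buf) {
        perror("numa_alloc_onnode");
        return 1;
    }
    memset(buf, 0, len); /* touch the pages so they are actually faulted in */
    numa_free(buf, len);
    return 0;
}
```

The design point is that applications and runtimes do not need a new programming model to benefit from expanded capacity; they need sensible placement policies for which data lands on the farther, larger tier.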

Today’s AI data centers often rely on tightly integrated, proprietary interconnect stacks. Nvidia’s portfolio—covering technologies like NVLink, NVSwitch, NVLink-C2C, InfiniBand, and Ethernet solutions tuned for AI fabrics—delivers staggering bandwidth and ultra-low latency to synchronize GPUs, CPUs, and DPUs. This vertically integrated approach has powered the current generation of closed AI superclusters, enabling highly coordinated parallel processing at scale.

CXL offers a complementary, open path forward. Its value lies in disaggregating and pooling memory, improving utilization across diverse hardware, and making it easier to scale systems without ripping and replacing entire racks. As the ecosystem matures, expect CXL to enable:
– Memory expansion and tiering to reduce bottlenecks for large model training and real-time inference
– Resource pooling across servers, improving utilization and lowering total cost of ownership
– More flexible composable infrastructure, allowing operators to dial up memory or accelerators as needed
– Coherent communication between CPUs and accelerators for faster, more efficient data access
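The tiering item above also maps onto familiar OS mechanisms. A minimal sketch, reusing the same hypothetical node IDs (local DRAM on node 0, a CXL expander on node 2), uses libnuma’s numa_move_pages() to demote a page the application has decided is cold; in practice, tiering software or the kernel’s own demotion paths would make that decision from access telemetry rather than by hand.

```c
/* Minimal sketch: demoting a "cold" page from local DRAM to a CXL-backed
 * NUMA node with numa_move_pages(). Node IDs are placeholder assumptions;
 * a real tiering layer would pick pages based on access-frequency data.
 * Build with: gcc demote.c -lnuma
 */
#include <numa.h>
#include <numaif.h>
#include <stdio.h>
#include <unistd.h>

#define DRAM_NODE 0  /* hypothetical local DRAM node */
#define CXL_NODE  2  /* hypothetical CXL memory-expander node */

int main(void) {
    if (numa_available() < 0)
        return 1;

    /* Allocate one page in local DRAM and touch it so it is faulted in. */
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    char *p = numa_alloc_onnode(page, DRAM_NODE);
    if (!p)
        return 1;
    p[0] = 1;

    /* Ask the kernel to migrate that page to the CXL node. */
    void *pages[1]  = { p };
    int   dest[1]   = { CXL_NODE };
    int   status[1] = { -1 };
    if (numa_move_pages(0 /* self */, 1, pages, dest, status, MPOL_MF_MOVE) != 0) {
        perror("numa_move_pages");
    } else {
        /* status[0] now reports the node the page landed on (or an errno). */
        printf("page now resides on node %d\n", status[0]);
    }

    numa_free(p, page);
    return 0;
}
```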

The “restart phase” for CXL marks a transition from early proofs of concept to accelerated, production-grade adoption. Several factors are converging: advancing CXL specifications, broader CPU and accelerator support, emerging memory expanders and switches, and growing software enablement across operating systems, hypervisors, and orchestration stacks. Together, these pieces set the stage for CXL to become a foundational layer of next-generation data centers.
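On the operating-system side of that enablement, mainline Linux already ships a CXL subsystem that enumerates memory devices, ports, and decoders through sysfs. The short sketch below assumes a kernel built with the CXL drivers enabled and simply lists whatever the subsystem has registered under /sys/bus/cxl/devices; the exact entries vary by platform and kernel version.

```c
/* Minimal sketch: listing devices the Linux CXL subsystem has registered.
 * Assumes a kernel with CXL driver support; on such systems memory
 * expanders typically appear as "memN" entries alongside ports/decoders. */
#include <dirent.h>
#include <stdio.h>

int main(void) {
    const char *path = "/sys/bus/cxl/devices";
    DIR *dir = opendir(path);
    if (!dir) {
        perror(path);  /* no CXL subsystem present, or nothing enumerated */
        return 1;
    }
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        if (entry->d_name[0] == '.')
            continue;  /* skip "." and ".." */
        printf("%s\n", entry->d_name);
    }
    closedir(dir);
    return 0;
}
```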

What this means for buyers and builders of AI infrastructure is choice. Proprietary GPU fabrics will continue to dominate certain ultra-high-performance clusters, but open CXL-based architectures will broaden deployment models and help right-size memory and compute for a wider range of workloads. The endgame is a hybrid landscape where closed, high-performance fabrics coexist with open, scalable memory-centric designs—each used where it delivers the most value.

As generative AI scales and the cost of moving data becomes as critical as compute itself, CXL’s promise of low-latency, high-throughput access to shared memory resources is hard to ignore. With adoption projected to reach the vast majority of new servers by 2028, the interconnect is on track to redefine how data centers architect performance, efficiency, and flexibility in the AI era.