<p>➀ Panmnesia proposes a CXL-over-Xlink datacenter architecture that combines GPU-optimized interconnects with CXL memory sharing, achieving 5.3x faster AI training and 6x lower inference latency than PCIe/RDMA-based systems;</p>
<p>➁ Key enhancements include independent scaling of compute and memory, dynamic resource pooling, hierarchical memory integration (HBM + CXL), and cascaded CXL 3.1 switches that form scalable low-latency fabrics;</p>
<p>➂ The architecture reduces communication overhead through accelerator-optimized links (sub-100ns latency) and enables petabyte-scale memory access for AI workloads, addressing the memory-capacity and communication bottlenecks of traditional GPU clusters.</p>
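<p>To make the hierarchical HBM + CXL idea in ➁ concrete, here is a minimal sketch of two-tier memory placement: hot data (weights, activations) is kept in a GPU's local HBM while colder, capacity-bound state spills into a rack-level CXL memory pool. All names, capacities, and the <code>TieredAllocator</code> API below are illustrative assumptions, not Panmnesia's implementation; the paper describes the hardware fabric, and any placement policy like this would live in the software stack above it.</p>
<pre><code>from dataclasses import dataclass, field

# Assumed capacities for illustration only; real values depend on the
# GPU model and on how much of the CXL pool this node is granted.
HBM_CAPACITY_GB = 80          # one GPU's local HBM
CXL_POOL_CAPACITY_GB = 4096   # slice of a shared CXL 3.1 memory pool

@dataclass
class Tensor:
    name: str
    size_gb: float
    hot: bool  # frequently accessed (weights/activations) vs. cold (optimizer state, KV-cache overflow)

@dataclass
class TieredAllocator:
    """Hypothetical two-tier policy: prefer HBM for hot data, spill the rest to CXL."""
    hbm_free: float = HBM_CAPACITY_GB
    cxl_free: float = CXL_POOL_CAPACITY_GB
    placement: dict = field(default_factory=dict)

    def allocate(self, t: Tensor) -> str:
        # Hot tensors go to HBM while capacity lasts; everything else
        # lands in the (much larger, higher-latency) CXL pool.
        if t.hot and t.size_gb <= self.hbm_free:
            self.hbm_free -= t.size_gb
            tier = "HBM"
        elif t.size_gb <= self.cxl_free:
            self.cxl_free -= t.size_gb
            tier = "CXL"
        else:
            raise MemoryError(f"no tier can hold {t.name} ({t.size_gb} GB)")
        self.placement[t.name] = tier
        return tier

alloc = TieredAllocator()
for t in [Tensor("weights", 40, hot=True),
          Tensor("activations", 30, hot=True),
          Tensor("optimizer_state", 160, hot=False),
          Tensor("kv_cache_overflow", 500, hot=False)]:
    print(f"{t.name:>18} -> {alloc.allocate(t)}")
</code></pre>
<p>The point of the sketch is the capacity asymmetry: the cold tensors here (660 GB) exceed any single GPU's HBM many times over but are a rounding error against a pooled, petabyte-scale CXL tier, which is the bottleneck ➂ says the architecture targets.</p>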