➀ Broadcom has announced the general availability of Sian2, its 200Gbps per lane PAM-4 DSP PHY that supports 100Gbps electrical and 200Gbps optical interfaces. Sian2 and Sian DSPs are designed to enable pluggable modules with 200G/lane interfaces for next-generation AI clusters. ➁ Moving from 400G/800G links built on 100G/lane optics to 800G/1.6T links built on 200G/lane optics is needed to deliver higher bandwidth at lower power, lower latency, and lower cost. ➂ Sian2 and Sian DSPs are optimized for 800G and 1.6T optical module platforms, doubling per-lane bandwidth while reducing power, latency, and cost per bit to support AI data center scaling.
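The lane arithmetic behind the 100G/lane to 200G/lane migration can be sketched as follows. This is a minimal illustration only: the function name `lanes_needed` is hypothetical, and the calculation uses nominal link rates, ignoring encoding and FEC overhead.

```python
def lanes_needed(link_gbps: int, lane_gbps: int) -> int:
    """Return how many lanes of lane_gbps make up a link of link_gbps.

    Hypothetical helper for illustration; uses nominal rates only.
    """
    if lane_gbps <= 0 or link_gbps % lane_gbps != 0:
        raise ValueError("link rate must be a positive multiple of the lane rate")
    return link_gbps // lane_gbps


# An 800G link needs 8 lanes at 100G/lane but only 4 lanes at 200G/lane.
print(lanes_needed(800, 100))   # 8
print(lanes_needed(800, 200))   # 4
# Keeping 8 lanes at 200G/lane yields a 1.6T link.
print(lanes_needed(1600, 200))  # 8
```

With the lane count held at eight, doubling the per-lane rate is what takes a module from 800G to 1.6T.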
Recent #InfiniBand news in the semiconductor industry
➀ The article discusses the evolution of large models from billion-parameter language models to trillion-parameter multimodal models, which demands a significant boost in the underlying computing capability of clusters with many thousands of accelerator cards. ➁ It describes the network architectures of the AI clusters at ByteDance, Baidu, Alibaba, and Tencent, highlighting their use of technologies such as Broadcom Tomahawk 5 chips, InfiniBand, and RoCE. ➂ The article also examines Baidu's HPN-AIPod architecture and Alibaba's HPN7 network, focusing on their high-performance, scalable designs.
➀ RDMA allows direct access to remote memory without kernel intervention, offering high throughput and low latency. ➁ RDMA protocols include InfiniBand, RoCE, and iWARP, each with its own advantages and deployment scenarios. ➂ Load balancing in large-scale networks is challenging because traffic is dominated by a small number of large flows, which per-flow hashing spreads unevenly across links; this calls for more advanced techniques such as PLB and SDN-based traffic engineering.
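Why a handful of large flows is hard to balance can be seen in a small simulation of static per-flow hashing (ECMP-style). This is a sketch under stated assumptions, not a model of any vendor's switch: the CRC32 hash, the addresses, the source ports, and the helper names `ecmp_link` and `link_loads` are all hypothetical choices made for illustration. The destination port 4791 is the UDP port used by RoCEv2.

```python
import zlib


def ecmp_link(five_tuple, num_links):
    """Pick an uplink by hashing the flow's 5-tuple (static per-flow hashing).

    CRC32 stands in for the switch hash function; real hardware differs.
    """
    key = "|".join(str(field) for field in five_tuple).encode()
    return zlib.crc32(key) % num_links


def link_loads(flows, num_links):
    """Sum the offered load in Gbps per uplink.

    flows is a list of (five_tuple, gbps) pairs.
    """
    loads = [0.0] * num_links
    for five_tuple, gbps in flows:
        loads[ecmp_link(five_tuple, num_links)] += gbps
    return loads


# Eight large flows of 100 Gbps each across eight uplinks (made-up values).
flows = [
    (("10.0.0.%d" % i, "10.0.1.%d" % i, "udp", 49152 + i, 4791), 100.0)
    for i in range(8)
]
loads = link_loads(flows, num_links=8)
print("per-link load (Gbps):", loads)
print("ideal per-link load :", sum(loads) / len(loads))
print("busiest link        :", max(loads))
```

A perfectly even spread would place 100 Gbps on every uplink. Because each flow is pinned to one link by its hash, any two flows that collide double the load on that link while another link sits idle, and with only a few flows there is too little entropy for the collisions to average out.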