Elon Musk's xAI startup is upgrading its Colossus AI supercomputer cluster, doubling its GPU count from 100,000 NVIDIA Hopper GPUs to 200,000.
Colossus, described as the world's largest AI supercomputer, is used to train xAI's Grok family of large language models (LLMs) and powers the chatbot available to X Premium subscribers.
The Colossus facility was built in just 122 days, significantly faster than the typical timeframe for systems of its class. NVIDIA, which has highlighted its partnership with xAI on the project, called the pace unprecedented, and NVIDIA CEO Jensen Huang described Elon Musk as 'superhuman' for the feat.
While training the large Grok model, Colossus has sustained high data throughput and low network latency. This matters because distributed training synchronizes gradients across GPUs at every optimizer step, so the whole cluster stalls whenever the interconnect becomes the bottleneck.
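To make that concrete, the sketch below is a minimal, hypothetical all-reduce benchmark written with PyTorch's `torch.distributed` and the NCCL backend. It is not xAI's tooling; the function name, payload size, and script name are illustrative assumptions. It measures the kind of collective-communication bandwidth a training fabric must sustain.

```python
import os
import time
import torch
import torch.distributed as dist

def benchmark_allreduce(payload_mb: int = 256, iters: int = 20) -> None:
    """Time all-reduce over the interconnect and report effective bandwidth.

    Illustrative sketch only; payload size and iteration count are arbitrary.
    """
    rank = dist.get_rank()
    world = dist.get_world_size()
    # torchrun sets LOCAL_RANK; pin each process to its own GPU.
    device = torch.device("cuda", int(os.environ.get("LOCAL_RANK", 0)))
    torch.cuda.set_device(device)
    # float32 tensors: 4 bytes per element.
    buf = torch.randn(payload_mb * 1024 * 1024 // 4, device=device)

    # Warm-up iterations let NCCL build its communicators before timing.
    for _ in range(5):
        dist.all_reduce(buf)
    torch.cuda.synchronize()

    start = time.perf_counter()
    for _ in range(iters):
        dist.all_reduce(buf)
    torch.cuda.synchronize()
    elapsed = time.perf_counter() - start

    if rank == 0:
        # A ring all-reduce moves roughly 2*(world-1)/world of the payload
        # per rank, so that factor gives the effective bus bandwidth.
        bytes_moved = buf.numel() * buf.element_size() * 2 * (world - 1) / world
        print(f"all-reduce: {elapsed / iters * 1e3:.2f} ms/iter, "
              f"~{bytes_moved * iters / elapsed / 1e9:.1f} GB/s effective")

if __name__ == "__main__":
    # Launch with: torchrun --nproc_per_node=<gpus_per_node> allreduce_bench.py
    dist.init_process_group(backend="nccl")
    benchmark_allreduce()
    dist.destroy_process_group()
```

At cluster scale, this all-reduce pattern runs across thousands of nodes every step, which is why sustained throughput and low latency on the network fabric translate directly into training speed.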
Elon Musk has called Colossus the most powerful training system in the world, and an xAI spokesperson credited NVIDIA's Hopper GPUs and Spectrum-X Ethernet networking with making AI model training possible at this scale.