Recent Cerebras news in the semiconductor industry

4 months ago

➀ Cerebras Systems' wafer-scale AI chip (WSE-3) runs the 70-billion-parameter DeepSeek-R1 model 57 times faster than the fastest GPUs.

➁ Cerebras CEO Andrew Feldman says enterprise customers are highly enthusiastic about DeepSeek's new R1 reasoning model, with demand surging within ten days of its launch.

➂ The WSE-3 chip, fabricated on a 12-inch wafer with TSMC's 5nm process, packs 4 trillion transistors, 900,000 AI cores, 44 GB of on-chip SRAM, and 21 PB/s of aggregate memory bandwidth, with peak performance of 125 FP16 petaFLOPS.

➃ DeepSeek-R1 offers performance comparable to OpenAI's advanced reasoning models at low training cost; its open-source release lets tech firms build AI applications on it and chip makers optimize for it.

➄ Andrew Feldman emphasizes that while DeepSeek poses some risks, users should exercise basic judgment, much as they would when operating a power saw.

AI Chip · Cerebras
9 months ago
➀ Cerebras Systems introduces the WSE-3 AI chip, designed for training the largest AI models, built on 5nm technology with 4 trillion transistors.

➁ The chip features 900,000 AI-optimized cores, delivering 125 petaFLOPS of peak AI performance.

➂ Cerebras targets the inference market with a new service, claiming generation of 1,800 tokens per second, significantly outperforming Nvidia's H100.

➃ The company relies on on-chip SRAM for bandwidth, achieving 21 PB/s, versus 4.8 TB/s for Nvidia's HBM3e.

➄ Cerebras plans to support more models and aims for competitive pricing, starting at 10 cents per million tokens.
AI Chip · Cerebras · Inference Service
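A quick sanity check of the arithmetic behind these figures (a sketch only; the bandwidth, throughput, and pricing numbers are the vendor's own claims as quoted above, not independent measurements):

```python
# Bandwidth comparison quoted above: 21 PB/s on-chip SRAM (WSE-3)
# vs 4.8 TB/s for Nvidia's HBM3e.
SRAM_BW = 21e15      # bytes/s
HBM3E_BW = 4.8e12    # bytes/s
ratio = SRAM_BW / HBM3E_BW
print(f"bandwidth ratio: {ratio:.0f}x")  # → 4375x

# Implied cost of sustained generation at the claimed rate and
# starting price: 1,800 tokens/s at $0.10 per million tokens.
TOKENS_PER_SEC = 1_800
PRICE_PER_MTOK = 0.10  # dollars per million tokens
cost_per_hour = TOKENS_PER_SEC * 3600 / 1e6 * PRICE_PER_MTOK
print(f"cost at full rate: ${cost_per_hour:.3f}/hour")  # → $0.648/hour
```

So the quoted SRAM figure is roughly three and a half orders of magnitude above the HBM3e figure, and an hour of generation at the claimed peak rate would bill well under a dollar at the quoted starting price.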
9 months ago