<p>➀ Researchers at Sejong University developed STAU, a hardware accelerator that enables edge devices to run large AI models such as BERT and GPT, achieving a 5.18× speedup over CPUs at 97% accuracy;</p><p>➁ The design uses a Variable Systolic Array (VSA) and a Radix-2 softmax optimization to reduce computational complexity and power consumption, cutting processing time by 68% for long inputs;</p><p>➂ Implemented on an FPGA with a custom 16-bit floating-point format, STAU supports multiple transformer models via software updates, advancing on-device AI deployment without cloud dependency.</p>
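<p>The summary does not spell out how the Radix-2 softmax works. A common approach in hardware softmax units is to replace the base-e exponential with a power of two, since 2^z reduces to a shift for the integer part plus a small lookup for the fraction, whereas a full e^x circuit is costly. The sketch below is a minimal illustration of that rescaling trick in plain NumPy; the function names are illustrative and not taken from the paper, and the actual STAU datapath may differ:</p>
<pre><code>import numpy as np

def softmax_reference(x):
    # Standard numerically stable softmax using base-e exponentials.
    z = x - np.max(x)
    e = np.exp(z)
    return e / e.sum()

def softmax_radix2(x):
    # Radix-2 variant: rescale the shifted logits by log2(e) so that
    # 2^z equals e^(x - max). The result is mathematically identical,
    # but the hardware only needs a power-of-two unit (shift + small
    # fractional lookup) instead of a full exp() circuit.
    z = (x - np.max(x)) * np.log2(np.e)
    p = np.exp2(z)
    return p / p.sum()

if __name__ == "__main__":
    scores = np.array([2.0, 1.0, 0.1, -3.0])
    print(softmax_reference(scores))
    print(softmax_radix2(scores))  # matches the reference to float precision
</code></pre>
<p>In a hardware implementation, the np.exp2 step is the piece that maps to shifters and a lookup table, and the constant rescaling by log2(e) can typically be folded into upstream scaling of the attention scores.</p>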