Andrej Karpathy, former director of AI at Tesla, has demonstrated that GPT-2 can be reproduced in 24 hours for just $672 on a single 8XH100 node (24 hours at roughly $28 per node-hour works out to exactly $672), a far cry from the reported $100 million cost of training GPT-4. His project, llm.c, implements GPT training directly in C/CUDA, eliminating the need for heavyweight Python environments and frameworks and considerably speeding up the process. The takeaway is not that AI training is getting cheap across the board: reproducing older models keeps getting cheaper, but leading-edge models like GPT-4 still demand enormous investment, raising concerns about the environmental impact of their power consumption.
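The appeal of llm.c is that the entire training pipeline (forward pass, loss, backward pass, optimizer update) is written out by hand in C rather than assembled from a deep-learning framework. As a purely illustrative sketch (this is not code from llm.c, and the model here is a toy two-parameter linear fit rather than GPT-2), the same hand-rolled pattern looks like this in plain C:

```c
/* Toy sketch of the llm.c idea: a complete training loop in plain C, with the
 * forward pass, gradients, and SGD update coded by hand -- no framework.
 * NOT Karpathy's code; it fits y = w*x + b to synthetic data, while llm.c
 * applies the same structure to a full GPT-2 in C/CUDA. */
#include <stdio.h>

int main(void) {
    /* synthetic data generated from y = 2x + 1 */
    float xs[4] = {0.0f, 1.0f, 2.0f, 3.0f};
    float ys[4] = {1.0f, 3.0f, 5.0f, 7.0f};

    float w = 0.0f, b = 0.0f; /* trainable parameters */
    float lr = 0.05f;         /* learning rate */

    for (int step = 0; step < 200; step++) {
        float dw = 0.0f, db = 0.0f, loss = 0.0f;
        for (int i = 0; i < 4; i++) {
            float pred = w * xs[i] + b;         /* forward pass */
            float err  = pred - ys[i];
            loss += err * err / 4.0f;           /* mean squared error */
            dw   += 2.0f * err * xs[i] / 4.0f;  /* backward pass, by hand */
            db   += 2.0f * err / 4.0f;
        }
        w -= lr * dw; /* SGD parameter update */
        b -= lr * db;
        if (step % 50 == 0)
            printf("step %3d: loss=%.5f w=%.3f b=%.3f\n", step, loss, w, b);
    }
    printf("final: w=%.3f b=%.3f (target: w=2, b=1)\n", w, b);
    return 0;
}
```

Compile and run it with any C compiler (e.g. `gcc toy_train.c -o toy_train && ./toy_train`); the parameters converge to the target values. llm.c scales this same self-contained loop up to GPT-2, with CUDA kernels in place of the inner arithmetic, which is what lets it train without any Python dependency stack.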