After I posted my last article, "Do you know how much it costs to train Large Language Models (LLMs)?", I received a lot of replies and queries asking why and where these costs go. In this article, I try to explain that. I hope it answers everyone's questions.

We estimated a cost breakdown to develop key frontier models such as GPT-4 and Gemini Ultra, including R&D staff costs and compute for experiments. We found that most of the development cost is for the hardware at 47–67%, but R&D staff costs are substantial at 29–49%, with the remaining 2–6% going to energy consumption.
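To make the split concrete, here is a minimal sketch that divides an assumed total development cost using the midpoints of the ranges above (57% hardware, 39% staff, 4% energy). The $100M total and the midpoint shares are illustrative assumptions; only the percentage ranges come from the estimate itself.

```python
def cost_breakdown(total_cost_usd, hardware_share=0.57, staff_share=0.39, energy_share=0.04):
    """Split a total development cost using assumed mid-range shares."""
    return {
        "hardware": total_cost_usd * hardware_share,  # midpoint of 47-67%
        "staff": total_cost_usd * staff_share,        # midpoint of 29-49%
        "energy": total_cost_usd * energy_share,      # midpoint of 2-6%
    }

# Assume a hypothetical $100M development budget for illustration.
breakdown = cost_breakdown(100_000_000)
for category, cost in breakdown.items():
    print(f"{category}: ${cost:,.0f}")
```

Note the three shares sum to 1.0, so the categories exactly exhaust the total.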

If the trend of growing training costs continues, the largest training runs will cost more than a billion dollars by 2027.

Hardware costs include AI accelerator chips (GPUs or TPUs), servers, and interconnection hardware.

We also estimate the energy consumption of the hardware during the final training run of each model.
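The basic arithmetic behind such an energy estimate is simple: chips × power draw × run time, scaled by datacenter overhead (PUE). The sketch below is hypothetical; the GPU count, per-GPU wattage, run length, PUE, and electricity price are all placeholder assumptions, not figures from the estimate.

```python
def training_energy_kwh(num_gpus, gpu_power_watts, hours, pue=1.2):
    """Energy = chips x power x time, scaled by datacenter overhead (PUE)."""
    return num_gpus * gpu_power_watts * hours * pue / 1000  # Wh -> kWh

def energy_cost_usd(kwh, price_per_kwh=0.10):
    """Cost at an assumed flat electricity price."""
    return kwh * price_per_kwh

# Illustrative run: 10,000 GPUs at 700 W each for 90 days.
kwh = training_energy_kwh(num_gpus=10_000, gpu_power_watts=700, hours=90 * 24)
print(f"Energy: {kwh:,.0f} kWh, cost: ${energy_cost_usd(kwh):,.0f}")
```

Even at this scale, the electricity bill lands in the low millions of dollars, which is consistent with energy being the smallest slice of the breakdown.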

Using this method, we estimated the training costs for 45 frontier models (models that were in the top 10 in terms of training compute when they were released) and found that the overall growth rate is 2.4x per year.
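The billion-dollar projection above follows directly from compounding this 2.4x annual growth rate. In the sketch below, the ~$100M starting cost and 2024 base year are illustrative assumptions; only the 2.4x growth rate comes from the trend estimate.

```python
def projected_cost(base_cost_usd, base_year, target_year, growth_per_year=2.4):
    """Extrapolate training cost by compounding the annual growth rate."""
    return base_cost_usd * growth_per_year ** (target_year - base_year)

# Assuming a ~$100M largest run in 2024, project forward a few years.
for year in range(2024, 2028):
    print(f"{year}: ${projected_cost(100e6, 2024, year):,.0f}")
```

Since 2.4^3 ≈ 13.8, a ~$100M run compounds past the billion-dollar mark within three years, matching the 2027 projection.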