The beginning of the article was good, but the analysis of DeepSeek and what it means for Nvidia is confused and clearly out of the loop.
* People have been training models at below-fp32 precision for many years; I did this in 2021, and it was already easy in all the major libraries.
* GPU FLOPs are used for many things besides training the final released model.
* Demand for AI is capacity-limited, so it is possible, even likely, that increasing the AI delivered per FLOP would not substantially reduce the price of GPUs.
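For the first bullet: the standard recipe behind below-fp32 training is to do the heavy math in a low-precision type while keeping a full-precision "master copy" of the weights, so small updates aren't lost to rounding. Here is a minimal toy sketch of that idea in numpy (an illustration of the technique, not any particular library's API):

```python
import numpy as np

# Toy mixed-precision training loop: compute in float16, but keep a
# float32 "master copy" of the weights so small updates survive rounding.

rng = np.random.default_rng(0)
n, d = 256, 8
X = rng.normal(size=(n, d)).astype(np.float16)
true_w = np.ones(d, dtype=np.float32)
y = (X.astype(np.float32) @ true_w).astype(np.float16)

w_master = np.zeros(d, dtype=np.float32)  # fp32 master weights
lr = 0.1

for _ in range(300):
    w16 = w_master.astype(np.float16)           # low-precision copy for compute
    err = X @ w16 - y                           # fp16 forward pass
    grad16 = X.T @ err / np.float16(n)          # fp16 "backward" pass
    w_master -= lr * grad16.astype(np.float32)  # update happens in fp32;
                                                # real frameworks also add loss
                                                # scaling so tiny fp16 gradients
                                                # don't underflow to zero

print(np.allclose(w_master, true_w, atol=0.05))
```

In major frameworks this whole pattern is a one- or two-line switch (e.g. automatic mixed precision in PyTorch), which is why "they trained in low precision" is not by itself evidence of some novel capability.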
His DeepSeek argument was essentially that the people who actually understand the economics of running these teams (e.g. the engineers themselves) are looking at DeepSeek's claims and are genuinely awestruck.
Where does this "capacity" limit come from? I can get as many H100s from GCP or wherever as I wish; the only thing that is capacity-limited is 100k-GPU clusters à la Elon/X. But what DeepSeek (and the recent evidence of a limit to pure base-model scaling) shows is that those might actually not be profitable, and we may end up with much smaller base models scaled at inference time. Nvidia's moat in inference-time scaling is much smaller, and you don't need the humongous clusters for it either: you can just distribute the inference (and, in the future, run it locally too).