Thanks for the numbers! Isn't it more likely that the amount of power/heat generated per rack will stay constant over each upgrade cycle, and the upgrade simply unlocks a higher amount of service revenue per rack?
Not in the last few years. CPUs went from ~200W TDP to 500W.
And servers went from zero GPUs to multiple GPUs each. Though we might hit the "chips can't get bigger and cooling can't get much better" point there.
The usage would stay similar if it were, say, a rack filled with bulk-storage servers (hard drives have kept power draw roughly flat while growing in capacity).
But CPU/GPU-wise, it's just bigger chips, more chiplets, more power.
I'd imagine any flattening would come purely from "we have a DC now, rebuilding the cooling for the next generation doesn't make sense, so we'll just build servers with similar power draw as before", but given how fast AI has pushed development, that might not happen for a while.
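Rough numbers to make that concrete; apart from the TDPs already mentioned, the node configs and the flat overhead below are assumptions on my part:

```python
# Illustrative per-server draw across upgrade cycles. The CPU TDPs echo the
# numbers above; the node configs and the flat 150 W overhead are guesses.

def server_watts(sockets, cpu_tdp_w, gpus=0, gpu_tdp_w=0, base_w=150):
    """Very rough: CPUs + GPUs + a flat allowance for RAM, fans, drives, PSU loss."""
    return sockets * cpu_tdp_w + gpus * gpu_tdp_w + base_w

old_cpu_node = server_watts(2, 200)                          # ~550 W
new_cpu_node = server_watts(2, 500)                          # ~1150 W
gpu_node     = server_watts(2, 500, gpus=4, gpu_tdp_w=700)   # ~3950 W

for name, watts in [("dual 200 W CPU node", old_cpu_node),
                    ("dual 500 W CPU node", new_cpu_node),
                    ("same box + 4x 700 W GPUs", gpu_node)]:
    print(f"{name}: ~{watts} W")
```

Same rack, same number of boxes, several times the heat.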
I've been in university research computing for 15 years; we're large enough (~900 nodes) to need a dedicated DC, but not at the same scale as others around here.
Our racks are provisioned with two independent rails, each of which can support 7 kW. Until the last few years, this was more than enough power. As CPU TDPs increased, we started having to do things like not connecting some nodes to both redundant rails, or mixing disk servers into compute racks, to stay under 7 kW/rack.
A single HGX B300 box has 6x6 kW power supplies. Even before we get to paying the (high) power bills, it's going to cost a small fortune just to upgrade the racks, power distribution units, UPS, etc. to support more than a handful of those things.
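Back-of-envelope against our rack budget (the 6x6 kW figure and the 7 kW rails are from this thread; the actual-draw assumption is just a guess on my part):

```python
# One HGX-class box vs. a legacy rack budget. PSU count/rating and the rail
# budget come from the thread above; the draw estimate is an assumption.

nameplate_kw = 6 * 6        # 36 kW of installed PSU capacity in one box
rail_kw      = 7            # per-rail budget in our legacy racks
rack_kw      = 2 * rail_kw  # 14 kW if we give up rail redundancy entirely

# Assume the box never draws more than half its nameplate (redundant PSUs,
# derating -- a guess, not a spec). It still exceeds the whole rack on its own.
est_draw_kw = nameplate_kw / 2
print(f"boxes that fit in a 14 kW rack: {int(rack_kw // est_draw_kw)}")      # 0
print(f"legacy racks' worth of power per box: {est_draw_kw / rack_kw:.1f}")  # ~1.3
```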
> Isn't it more likely that the amount of power/heat generated per rack will stay constant over each upgrade cycle,
Power density seems to grow each cycle. But eventually your DC hits power capacity limits, and you have to leave racks empty because there's no power budget.
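Toy example of how that plays out; the facility feed, rack count, and densities below are all made up for illustration:

```python
# Fixed facility feed, growing per-rack density: the math behind empty racks.
# All numbers here are illustrative, not from any real site.

facility_kw    = 1000   # usable IT power for the room
racks_on_floor = 100    # physical rack positions

for rack_kw in (7, 14, 30, 60):   # per-rack draw across upgrade cycles
    powered = min(racks_on_floor, facility_kw // rack_kw)
    print(f"{rack_kw:>3} kW/rack: {powered} powered, {racks_on_floor - powered} left empty")
```

Floor space stays constant while per-rack draw climbs, so the powered fraction of the room keeps shrinking.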