The article in question shows that a mere 8 big cores and 2 little ones can use ...

namibj · on Oct 26, 2021

Which is why I suggested comparing to the EPYC 73F3, which is effectively a 5950X clocked at 3.5-4 GHz, with 4x the L3$, 4x the memory bandwidth (if you don't overclock it), and 5~6x the IO bandwidth.

We know a 5950X is roughly on par with an M1 Max (at least ignoring the latter's 2 efficiency cores). If the occasional wins of the M1 Max are due to memory bandwidth, this should more or less turn the tables.
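For anyone who wants to check the "4x the memory bandwidth" figure, here is a rough back-of-the-envelope sketch. It assumes stock DDR4-3200 on both parts, 64-bit channels, 2 channels on the 5950X and 8 on the EPYC 73F3; the function name is made up for illustration, and these are theoretical peaks, not measured numbers.

```python
def peak_bandwidth_gb_s(channels: int, mt_per_s: int, bus_bits: int = 64) -> float:
    """Theoretical peak in GB/s: channels * transfers per second * bytes per transfer."""
    return channels * mt_per_s * 1e6 * (bus_bits / 8) / 1e9


# Assumed configurations (stock DDR4-3200, no memory overclock).
ryzen_5950x = peak_bandwidth_gb_s(channels=2, mt_per_s=3200)
epyc_73f3 = peak_bandwidth_gb_s(channels=8, mt_per_s=3200)

print(f"5950X:     {ryzen_5950x:.1f} GB/s")
print(f"EPYC 73F3: {epyc_73f3:.1f} GB/s")
print(f"ratio:     {epyc_73f3 / ryzen_5950x:.1f}x")
```

Under those assumptions the two come out at roughly 51 GB/s and 205 GB/s, which is where the 4x comes from.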

gigatexal · on Oct 25, 2021

I thought HBM was a power-sucking tech?

dragontamer · on Oct 25, 2021

HBM runs at a very low clock speed and is super efficient.

HBM's downside is that it requires many, many, many pins. Each stack has a 1024-pin data interface (plus more pins for power). In practice, the only thing that can make HBM work is a silicon interposer or similar packaging substrate. (Typical chips have 4 to 6 HBM stacks, for well over 4096 pins of communication, plus more pins for power and other purposes.)

But HBM is among the lowest-power technologies available. Turns out that clocking every pin at something like 500 MHz (while LPDDR5 is probably at a 3200 MHz clock) saves a lot of power. Because DRAM has such high latency, channel speed is more about parallelism than anything else. (DDR4 splits RAM into 4 bank groups, each with 4 banks. All 16 banks can be accessed in parallel across the channel.)

HBM just does this parallel access thing at a lower clock rate, to save on power. But spends way more pins to do so.
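To put rough numbers on the wide-and-slow vs. narrow-and-fast tradeoff, here is a small sketch. It assumes the ballpark clocks from above (500 MHz for HBM, 3200 MHz for LPDDR5), double data rate on both, and 1024 data pins per HBM stack; the function name is hypothetical and the results are theoretical peaks only.

```python
def bandwidth_gb_s(data_pins: int, clock_mhz: float, transfers_per_clock: int = 2) -> float:
    """Theoretical peak in GB/s for an interface with the given pin count and clock."""
    bits_per_s = data_pins * clock_mhz * 1e6 * transfers_per_clock
    return bits_per_s / 8 / 1e9


# One HBM stack: very wide, slow clock (assumed 500 MHz, double data rate).
hbm_stack = bandwidth_gb_s(data_pins=1024, clock_mhz=500)

# One fast pin at the assumed 3200 MHz clock, double data rate.
fast_pin = bandwidth_gb_s(data_pins=1, clock_mhz=3200)

# Fast data pins needed to match a single HBM stack.
fast_pins_needed = hbm_stack / fast_pin

print(f"HBM stack:        {hbm_stack:.0f} GB/s over 1024 data pins")
print(f"Per fast pin:     {fast_pin:.2f} GB/s")
print(f"Fast pins needed: {fast_pins_needed:.0f}")
```

With these assumptions a stack peaks at about 128 GB/s, and matching it at the faster clock takes about 160 data pins instead of 1024. Same bandwidth, far fewer pins, but every one of them toggles 6.4x faster.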

namibj · on Oct 26, 2021

Only if you measure by the GB of capacity. Flips around if you measure by the GB/s of bandwidth.