
Any efficiency comparison involving Apple's chips also has to factor in that Tim Cook keeps showing up at TSMC's door with a freight container full of cash to buy out exclusive access to their bleeding-edge silicon processes. ARM may be a factor, but don't underestimate the power of having more money than God.

Case in point: Strix Point is built on TSMC 4nm while Apple is already using TSMC's second-generation 3nm process.



Let's do the math on M1 Pro (10-core, N5, 2021) vs HX370 (12-core, N4P, 2024).

Firestorm without L3 is 2.281mm2. Icestorm is 0.59mm2. M1 Pro has 8P+2E for a total of 19.428mm2 of cores included.

Zen4 without L3 is 3.84mm2. Zen4c reduces that down to 2.48mm2. Zen5 CCD is pretty much the same size as Zen4 (though with 27% more transistors), so core size should be similar. AMD has also stated that Zen5c has a similar shrink percent to Zen4c. We'll use their numbers. HX370 has 4P+8C for a total area of 35.2mm2. If being twice the size despite being on N4P instead of N5 like M1 seems like foreshadowing, it is.

We'll use notebookcheck's Cinebench 2024 multithread power and performance numbers to calculate perf / power / area then multiply that by 100 to eliminate some decimals.

M1 Pro scores 824 (10-core) and while they don't have a power value listed, they do list 33.6w package power running the prime95 power virus, so cinebench's power should be lower than that.

HX370 scored 1213 (12-core) and averaged 119w (maxing at a massive 121.7w and that's without running a power virus).

This gives the following perf/power/area*100 scores:

M1 Pro — 126 PPA

HX 370 — 29 PPA

M1 is more than 4.3x better while being an entire node behind and released three years earlier.
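If anyone wants to check the arithmetic, here's a throwaway Python sketch (function names are mine) that just redoes the math from the figures quoted above; note the 33.6w for M1 Pro is the prime95 upper bound, so its real PPA would be even higher:

    # Redo the perf / power / area * 100 ("PPA") comparison from the numbers above.
    # Core areas are mm^2 without L3, scores are Cinebench 2024 MT, power is watts.

    def core_area(big_cores, big_mm2, small_cores, small_mm2):
        return big_cores * big_mm2 + small_cores * small_mm2

    def ppa(score, watts, area_mm2):
        return score / watts / area_mm2 * 100

    m1_pro_area = core_area(8, 2.281, 2, 0.59)  # Firestorm + Icestorm, ~19.43 mm^2
    hx370_area = core_area(4, 3.84, 8, 2.48)    # Zen5 + Zen5c (using Zen4/Zen4c sizes), 35.2 mm^2

    print(f"M1 Pro: {ppa(824, 33.6, m1_pro_area):.0f} PPA")   # ~126 (33.6w is an upper bound)
    print(f"HX 370: {ppa(1213, 119.0, hx370_area):.0f} PPA")  # ~29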


119W for the HX370 looks extremely sus; that seems more like system-level power consumption, not CPU-only.

According to Phoronix [1][2], their Blender CPU test measured a peak of 33W.

Here are max power numbers from some other tests that I know are multi-threaded:

--

Linux 6.8 Compilation: 33.13 W

LLVM Compilation: 33.25 W

--

If I plug 33W into your equation (1213 / 33 / 35.2 * 100), that gives the HX 370 a score of 104 PPA.

This supports the HX 370 being pretty power efficient, although still not as power efficient as the M3.

[1] https://www.phoronix.com/review/amd-ryzen-ai-9-hx-370/3

[2] https://www.phoronix.com/review/amd-ryzen-ai-9-hx-370/4


https://www.notebookcheck.net/AMD-Zen-5-Strix-Point-CPU-anal...

They got those kinds of numbers across multiple systems. You can take it up with them I guess.

I didn't even mention one of these systems was peaking at 59w on single-core workloads.


I see what's going on. They have two HX370 laptops:

  Laptop  MC score  Avg Power
     P16      1213      113 W
     S16       921       29 W
  M3 Pro      1059    (30 W?)

They don't have M3 Pro power numbers, but I assume it is somewhere around 30W. It seems the HX 370 at ~30 W (the S16) has power efficiency similar to the M3 Pro.

Any more power and the CPU is much less power efficient: roughly a 300% increase in power for a 30% increase in performance.
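Quick points-per-watt arithmetic from those two rows (a throwaway sketch; scores and average power are the ones in the table):

    # Cinebench 2024 MT points per watt at the two HX 370 operating points above.
    for name, score, watts in [("P16", 1213, 113), ("S16", 921, 29)]:
        print(f"{name}: {score / watts:.1f} points/W")
    # P16 ~10.7 points/W vs S16 ~31.8 points/W: roughly 3x the efficiency is
    # given up for the last ~30% of performance.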


This is true for every CPU. Past a certain point power consumption scales quadratically with performance.
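A rough way to see why (a toy model with a made-up voltage/frequency relationship, not measured data): dynamic power goes roughly as V^2 * f, and voltage has to rise as the clock is pushed, so power grows roughly quadratically with performance near the top of the curve.

    # Toy model: dynamic power ~ V^2 * f, with voltage assumed to rise linearly
    # as the clock is pushed past a baseline; performance assumed ~ clock.
    def relative_power(perf, k=0.5):  # k is an assumed voltage-scaling factor
        v = 1 + k * (perf - 1)
        return v * v * perf

    for perf in (1.0, 1.1, 1.2, 1.3, 1.4):
        print(f"{perf:.1f}x performance -> ~{relative_power(perf):.2f}x power")
    # ~1.3x performance already costs ~1.7x power here; real curves differ per
    # chip, but this shape is why perf/W collapses at high clocks.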


About Cinebench vs. Geekbench vs. SPEC: https://old.reddit.com/r/hardware/comments/pitid6/eli5_why_d... (that's about Cinebench R20). An overview of Cinebench 2024 CPU & GPU(!) scores: https://www.cgdirector.com/cinebench-2024-scores/


Even with the M3 the difference is marginal in multi-threaded benchmarks, from the Cinebench link [1] someone posted earlier on the thread.

    Apple M3 Pro 11-Core - 394 Points per Watt
    AMD Ryzen AI 9 HX 370 - 354 Points per Watt
    Apple M3 Max 16-Core - 306 Points per Watt
And the Ryzen is on TSMC 4nm while the M3 is on 3nm. As the parent is saying, a lot of the Apple Silicon hype was due to the massive upgrade it was over the Intel CPUs Apple was using previously.

[1]: https://www.notebookcheck.net/AMD-Zen-5-Strix-Point-CPU-anal...


Their efficiency tests use Cinebench R23 (as called out explicitly).

R23 is not optimized for Apple silicon but is for x86. The R24 numbers are actually what you need for a fair comparison, otherwise you put the Arm numbers at a significant handicap.


That the Max should be worse than the M3 Pro is a little bit shady.


Cinebench might not be the most relevant benchmark; it uses lots of scalar instructions with fairly high branch mispredictions and low IPC: https://chipsandcheese.com/2021/02/22/analyzing-zen-2s-cineb....


Power efficiency is a curve, and Apple may have its own reason not to make the M1 Pro run at 110W as well.


I think the OP might have misread the power numbers; 110 W is well into desktop CPU power range. Here is an excerpt from AnandTech:

> In our peak power test, the Ryzen AI 9 HX 370 ramped up and peaked at 33 W.

https://www.anandtech.com/show/21485/the-amd-ryzen-ai-hx-370...


You can read the notebookcheck review for yourself.

https://www.notebookcheck.net/AMD-Zen-5-Strix-Point-CPU-anal...


Those 100W+ numbers are total system power. And that system has the CPU TDP set to 80W (far above AMD's official max of 54W). It also has a discrete 4070 GPU that can use over 100W on its own.


If x86 laptops have 90w of platform power, that's concerning in itself, not a reasonable defense.

Remember, Apple laptops have screens too, etc., and that shows up in the average system power measurements the same way. What's the difference in an x86 laptop?

I really doubt it's actually platform power; the problem is that x86 is boosting up to 35W average/60W peak per thread. 120W package power isn't unexpected if you're boosting 3-4 cores to maximum!

And that's the problem. x86 is far far worse at race-to-sleep. It's not just "macos has better scheduling"... you can see from the 1T power measurements that x86 is simply drawing 2-3x the power while it's racing-to-sleep, for performance that's roughly equivalent to ARM.

Whatever the cause, whether it's just bad design from AMD and Intel, or legacy x86 cruft (I don't get how this applies to actual computational load though, as opposed to situations like idle power), or something else... there is no getting around the fact that the M2 tops out at 10W per core while an 8840HS or HX370 or Intel Meteor Lake boosts to 30-35W at 1T loads.


I stacked the deck in AMD's favor using a 3-year-old chip on an older node.

Why is AMD using 3.6x more power than M1 to get just 32% higher performance while having 17% more cores? Why are AMD's cores nearly 2x the size despite being on a better node and having 3 more years to work on them?

Why are Apple's scores the same on battery while AMD's scores drop dramatically?

Apple does have a reason not to run at 120w -- it doesn't need to.

Meanwhile, if AMD used the same 33w, nobody would buy their chips because performance would be so incredibly bad.


You should try not to talk so confidently about things you don't know about -- this statement

> if AMD used the same 33w, nobody would buy their chips because performance would be so incredibly bad

Is completely incorrect, as another commenter (and I think the notebookcheck article?) points out. 30w is about the sweet spot for these processors, and the reason that 110w laptop seems so inefficient is that it's giving the APU 80w of TDP, which is a bit silly since it only performs marginally better than if you gave it e.g. 30 watts. It's not a good idea to take that example as a benchmark for the APU's efficiency; it varies depending on how much TDP you give the processor, and 80w is not a good TDP for these chips.


Halo products with high scores sell chips. This isn’t a new idea.

So you lower the wattage. Now you're at M1 Pro levels of performance with 17% more cores and nearly double the die area, barely competing with a chip 3 years older while being on a newer, more expensive node too.

That’s not selling me on your product (and that’s without mentioning the worst core latency I’ve seen in years when going between P and C cores).


> if AMD used the same 33w, nobody would buy their chips because performance would be so incredibly bad

I'm writing this comment on an HP ProBook 445 G8 laptop. I believe I bought it in early 2022, so it's a relatively old model. The laptop has a Ryzen 5 5600U processor which uses ≤ 25W. I'm quite happy with both the performance and battery life.


It's well known that performance doesn't scale linearly with power.

Benchmarking incentives on PC have long pushed X86 vendors to drive their CPUs at points of the power/performance curve that make their chips look less efficient than they really are. Laptop benchmarking has inherited that culture from desktop PC benchmarking to some extent. This is slowly changing, but Apple has never been subject to the same benchmarking pressures in the first place.

You'll see in reviews that Zen5 can be very efficient when operated in the right power range.


Zen5 can be more efficient at lower clockspeeds, but then it loses badly to Apple's chips in raw performance.


> I stacked the deck in AMD's favor using a 3-year-old chip on an older node.

You could just compare the ones that are actually on the same process node:

https://www.notebookcheck.net/R9-7945HX3D-vs-M2-Max_15073_14...

But then you would see an AMD CPU with a lower TDP getting higher benchmark results.

> Why is AMD using 3.6x more power than M1 to get just 32% higher performance while having 17% more cores?

Getting 32% higher performance from 17% more cores implies higher performance per core.

The power measurements that site uses are from the plug, which is highly variable to the point of uselessness because it takes into account every other component the OEM puts into the machine and random other factors like screen brightness, thermal solution and temperature targets (which affects fan speed which affects fan power consumption) etc. If you measure the wall power of a system with a discrete GPU that by itself has a TDP >100W and the system is drawing >100W, this tells you nothing about the efficiency of the CPU.

AMD's CPUs have internal power monitors and configurable power targets. At full load there is very little daylight between the configured TDP and what they actually use. This is basically required because the CPU has to be able to operate in a system that can't dissipate more heat than that, or one that can't supply more power.

> Meanwhile, if AMD used the same 33w, nobody would buy their chips because performance would be so incredibly bad.

33W is approximately what their mobile CPUs actually use. Also, even lower-configured TDP models exist and they're not that much slower, e.g. the 7840U has a base TDP of 15W vs. 35W for the 7840HS and the difference is a base clock of 3.3GHz instead of 3.8GHz.


> Getting 32% higher performance from 17% more cores implies higher performance per core.

I don't disagree that it is higher perf/core. It is simply MUCH worse perf/watt because they are forced to clock so high to achieve those results.

> The power measurements that site uses are from the plug, which is highly variable to the point of uselessness

They measure the HX370 using 119w with the screen off (using an external monitor). What on that motherboard would be using the remaining 85+W of power?

TDP is a suggestion, not a hard limit. Before thermal throttling, they will often exceed the TDP by a factor of 2x or more.

As to these specific benchmarks, the R9 7945HX3D you linked to used 187w while the M2 Max used 78w for CB R15. As to perf/watt, Cinebench before 2024 wasn't using NEON properly on ARM, but was using Intel's hyper-optimized libraries for x86. You should be looking at benchmarks without such a massive bias.


> I don't disagree that it is higher perf/core. It is simply MUCH worse perf/watt because they are forced to clock so high to achieve those results.

The base clock for that CPU is nominally 2 GHz.

> They measure the HX370 using 119w with the screen off (using an external monitor). What on that motherboard would be using the remaining 85+W of power?

For the Asus ProArt P16 H7606WI? Probably the 115W RTX 4070.

> TDP is a suggestion, not a hard limit. Before thermal throttling, they will often exceed the TDP by a factor of 2x or more.

TDP is not really a suggestion. There are systems that can't dissipate more than a specific amount of heat and producing more than that could fry other components in the system even if the CPU itself isn't over-temperature yet, e.g. because the other components have a lower heat tolerance. There are also systems that can't supply more than a specific amount of power and if the CPU tried to non-trivially exceed that limit the system would crash.

The TDP is, however, configurable, including different values for boost. So if the OEM sets the value to the higher end of the range even though their cooling solution can't handle it, the CPU will start out there and gradually lower its power use as it becomes thermally limited. This is not the same as "TDP is a suggestion", it's just not quite as simple as a single number.

> As to these specific benchmarks, the R9 7945HX3D you linked to used 187w while the M2 Max used 78w for CB R15.

Which is the same site measuring power consumption at the plug on an arbitrary system with arbitrary other components drawing power. Are they even measuring it through the power brick and adding its conversion losses?

These CPUs have internal power meters. Doing it the way they're doing it is meaningless and unnecessary.
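For example, on Linux you can read the package energy counter straight from the kernel instead of guessing from the wall plug. A minimal sketch, assuming the RAPL powercap driver supports the CPU and exposes the usual sysfs files (driver support, the exact path, and read permissions vary by kernel and platform, so treat this as illustrative):

    # Estimate package power from the Linux RAPL powercap interface.
    import time

    RAPL = "/sys/class/powercap/intel-rapl:0"  # package domain on many systems

    def read_uj(name):
        with open(f"{RAPL}/{name}") as f:
            return int(f.read().strip())

    def package_watts(interval=1.0):
        wrap = read_uj("max_energy_range_uj")
        e0 = read_uj("energy_uj")
        time.sleep(interval)
        e1 = read_uj("energy_uj")
        return ((e1 - e0) % wrap) / 1e6 / interval  # uJ -> W, handles counter wraparound

    print(f"~{package_watts():.1f} W package power over 1 s")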

> You should be looking at benchmarks without such a massive bias.

Do you have one that compares the same CPUs on some representative set of tests and actually measures the power consumption of the CPU itself? Diligently-conducted benchmarks are unfortunately rare.

Note however that the same link shows the 7945HX3D also ahead in Blender, Geekbench ST and MT, Kraken, Octane, etc. It's consistently faster on the same process, and has a lower TDP.


lmao he’s citing cinebench R15? Which isn’t just ancient but actually emulated on arm, of course.

Really digging through the vaults for that one.

Geekbench 6 is perfectly fine for that stuff. But that still shows Apple tying in MT and beating the pants off x86 in 1T efficiency.

x86 1T boosts being silly is where the real problem comes from. But if they don’t throw 30-35w at a single thread they lose horribly.


> lmao he’s citing cinebench R15?

It's the only one where they measured the power use. I don't get to decide which tests they run. But if their method of measuring power use is going to be meaningless then the associated benchmark result might as well be too, right?

> Geekbench 6 is perfectly fine for that stuff. But that still shows apple tieing in MT and beating the pants off x86 in 1T efficiency.

It shows Apple behind by 8% in ST and 12% in MT with no power measurement for that test at all, but an Apple CPU with a higher TDP. Meanwhile the claim was that AMD hadn't even caught up on the same process, which isn't true.

> x86 1T boosts being silly is where the real problem comes from. But if they don’t throw 30-35w at a single thread they lose horribly.

They don't use 30-35W for a single thread on mobile CPUs. The average for the HX 370 from a set of mostly-threaded benchmarks was 20W when you actually measure the power consumption of the CPU:

https://www.phoronix.com/review/amd-ryzen-ai-9-hx-370/13

On single-threaded tests like PyBench the average was 10W:

https://www.phoronix.com/review/amd-ryzen-ai-9-hx-370/9

34W was the max across all tests, presumably the configured TDP for that system, derived from the tests like compiling LLVM that max out arbitrarily many cores.


Process helps, but have you seen benchmarks showing equivalent performance on the same process node? I think it's less that ARM is amazing than that the Apple Silicon team is very good and paired with aggressive optimization throughout the stack; everything I've seen suggests they are simply building better chips at their target levels (not server, high power, etc.).


> Our benchmark database shows the Dimensity 9300 scores 2,207 and 7,408 in Geekbench 6.2's single and multi-core tests. A 30% performance improvement implies the Dimensity 9400 would score around 2,869 and 9,630. Its single-core performance is close to that of the Snapdragon 8 Gen 4 (2,884/8,840) and it understandably takes the lead in multi-core. Both are within spitting distance from the Apple A17 Pro, which scores 2,915 and 7,222 points in the benchmark. Then again, all three chips are said to be manufactured on TSMC's N3 class node, effectively leveling the playing field.

https://www.notebookcheck.net/MediaTek-Dimensity-9400-rumour...


That appears to be an unconfirmed rumor and it’s exciting if true (and there aren’t major caveats on power), but did you notice how they mentioned extra work by ARM? The argument isn’t that Apple is unique, it’s that the performance gaps they’ve shown are more than simply buying premium fab capacity.

That doesn't mean other designers can't also do that work, but simply that it's more than just the process - for example, the M2 shipped on TSMC's N5P first as an exclusive but when Zen 5 shipped later on the same process it didn't close the single core performance or perf/watt gap. Some of that is x86 vs. ARM, but there isn't a single, simple factor which can explain this - e.g. Apple carefully tuning the hardware, firmware, OS, compilers, and libraries too undoubtedly helps a lot, and it's been a perennial problem for non-Intel vendors on the PC side since so many developers have tuned for Intel first/only for decades.


> for example, the M2 shipped on TSMC’s N5P first as an exclusive but when Zen 5 shipped later on the same process it didn’t close the single core performance or perf/watt gap.

That was Zen 4, but it did close the gap:

https://www.notebookcheck.net/R9-7945HX3D-vs-M2-Max_15073_14...

Single thread performance is higher (so is MT), TDP is slightly lower, Cinebench MT "points per watt" is 5% higher.

We'll get to see it again when the 3nm version of Zen5 is released (the initial ones are 4nm, which is a node Apple didn't use).


Since it's unclear whether Apple has a significant architectural advantage over Qualcomm and MediaTek, I would rather attribute this to relatively poor AMD architectures. Provisionally. At least their GPUs have been behind Nvidia for years. (AMD holding its own against Intel is not surprising given Intel's chip fab problems.)


Yes, to be clear I’d be very happy if MediaTek jumps in with a strong contender since consumers win. It doesn’t look like the Qualcomm chips are performing as well as hoped but I’d wait a bit to see how much tuning helps since Windows ARM was not a major target until now.


I guess getting close to the same single-thread score is nice. Unfortunately, since only Apple is shipping, it is hard to compare whether the others burn the battery to get there.

I suspect the other two, like Apple with the A18 shipping next month, will be using the second-gen N3. Apple is expected to be around 3500 on that node.

Needless to say, what will be very interesting is to see the perf/watt of all three on the same node and shipping in actual products where the benchmarks can be put to more useful tests.


Yeah, and GPU tests, since the benchmarks above were only for the CPU.



