M3 Max Geekbench (geekbench.com)
72 points by tambourine_man on Nov 2, 2023 | 77 comments


M3 performance so far on Geekbench: all M3 chips seem to have the same single-core score, around 3000, which is 10% higher than the M2 Max and on par with the i9-13900KS.

In terms of multi-core:

- M3 (8-core): 11500, 10% higher than the M2, sitting between the M2 and the 10-core M2 Pro

- M3 Max (16-core): 20600, 50% higher than M2 Max (12-core), same as i9-13900KS

For comparison, the Snapdragon X Elite (at 23W) announced recently scores 2800 in single-core and 14000 in multi-core, same as the M2 Max.

In terms of graphics (OpenCL):

- M3: 30000, 10% higher than the M2, same as the AMD Radeon 780M (currently the best iGPU in x86 land)

- M3 Max: 94000, 10% higher than M2 Max, a bit above the RTX 2070.


I wouldn't use the OpenCL score; OpenCL is a second-class citizen on macOS that was deprecated more than 5 years ago.

It still works, but as you can see in Geekbench, Metal gets a much higher score running the same workload.


Can you run Metal on an Nvidia GPU? If not, how do you compare at all?



> Metal gets a much higher score running the same workload.

Are you sure those scores are comparable?


If it's doing the same work on the GPU, yes.


The comparisons are actually exciting me more than the M3 itself. Yay, more choice!

However, I guess you can crank any core up to any performance if you're willing to throw power at it - performance per watt seems to be by far the most interesting metric. I suppose it's certain that the i9 can't keep up at that - but maybe the Qualcomm chip can? If it's 30% or so worse it would actually still be quite good ...
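A very rough points-per-watt sketch using the scores quoted in this thread plus claimed/spec power figures (the ~45W M3 Max and 23W Snapdragon numbers are vendor claims, 253W is the i9's max turbo spec; none of these are measured package power, so treat this as a ballpark only):

    # Rough performance-per-watt from the figures quoted in this thread.
    # Power numbers are claimed/spec limits, not measured draw.
    chips = {
        "M3 Max (GB6 multi ~20600, ~45 W claimed)":           (20600, 45),
        "Snapdragon X Elite (GB6 multi ~14000, 23 W quoted)": (14000, 23),
        "i9-13900KS (GB6 multi ~20600, 253 W max turbo)":     (20600, 253),
    }
    for name, (score, watts) in chips.items():
        print(f"{name}: ~{score / watts:.0f} GB6 points/W")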


So a bit faster than the 7950X in eco mode (which has a much lower TDP), and the 7950X's process is two generations old. It's clear that Apple's advantage is mostly just one thing: "TSMC".


A laptop that can match the best of the competition's desktops is a pretty big advantage IMHO.

Don't forget that this M3 Max chip is measured inside a laptop that can run at the same performance on battery. Apple's desktop systems already matched the competition's greatest offerings with their previous-gen chips, despite constraints on size and cooling-system noise.

I wonder what Apple can do in its labs when pushing its silicon to the limit without constraints on cooling. Every now and then a fringe Geekbench score will pop up, maybe from those lab experiments.


> A laptop that can match the best of the competition's desktops is a pretty big advantage IMHO.

The 7945HX3D (a 5nm 55W laptop part) scores 15-16k on GB6. The 3nm M3 Max is just a little better, and almost all of that can be attributed to TSMC's 3nm process.


Sure, if you define "a bit" to be 30% then you can say that.

Not everyone will agree with that definition of "a bit" though - how many years did it take AMD to hit a 30% improvement?

Also, the high-performance laptops with that chip appear to be 50% heavier than the 16" MacBook Pro. According to reviews, those laptops can do 4 hours of general usage (not gaming) at best - so terrible battery life. Also, a significant drop in performance when unplugged.

Overall, those machines are optimised for one thing only, and apparently they do it about 30% worse than a MacBook and everything else much, much worse.


For how long can the MacBook do the 30% more? Because in my experience as someone who renders animations for multiple days or even weeks at a time, the newer Apple designs are good for burst loads, but don't shine when it comes to continuous loads where thermals start to kick in.


> Sure, if you define "a bit" to be 30% then you can say that.

30% is roughly the difference to be expected between N5 -> N5P/N4 -> N3. That was the original argument - "TSMC".

About laptop battery life etc, that's a different discussion.


> 30% is roughly the difference to be expected between N5 -> N5P/N4 -> N3. That was the original argument - "TSMC".

Not true, according to TSMC's own messaging. Keep in mind that the predicted power and performance improvements each hold the other factor constant.

https://www.anandtech.com/show/18833/tsmc-details-3nm-evolut...


In theory at least. When rendering a Blender animation, my i7/rtx2040 notebook easily outperformed the then-strongest M1/Metal laptop by a factor of 2.

My suspicion is that during long workloads (more than an hour at max load) something like a MacBook tapers off into thermal throttling. In the studio we have a bunch of M2 Mac Minis; they are good for the price, but with them too I am unsure about thermals under continuous load.


Not a hard achievement if you are comparing against non-3D chips. The M3 and AMD's X3D chips have way more cache available to the CPU, which really livens up the game. It's impressive that AMD's mobile variant of the X3D chips is pretty much on par. It has the advantage of faster single-core perf though.

https://browser.geekbench.com/search?utf8=%E2%9C%93&q=7945HX...


The M3 Max scores higher on GB6 than the 7950X, full stop:

https://browser.geekbench.com/processors/amd-ryzen-9-7950x

I'm also not sure what you mean by the 7950X having a lower TDP than the M3 Max.

Even in eco mode, the 7950X has a TDP of 65W, vs the M3 Max's 45W:

https://www.anandtech.com/show/17585/amd-zen-4-ryzen-9-7950x...


> The M3 Max scores higher on GB6 than the 7950X, full stop: https://browser.geekbench.com/processors/amd-ryzen-9-7950x

As you can see in this list [1], there are several systems which exceed 21k. The 19k scores are mostly stock systems with no tuning (for example, RAM at 3200MHz, stock fan, etc.). The Macs would have been carefully tuned, so it's only fair to do the same on self-assembled kits.

I haven't tested eco mode myself, but from what I've read, performance drops by 10-20% depending on the configuration (105W vs 65W). Still in the ballpark.

Add: and 7950X3D would do even better.

[1]: https://browser.geekbench.com/search?q=7950x


Yes, if you overclock the 7950X, ignoring any "system tuning", you can get a better GB6 score. If we could overclock the M3 Max, we could achieve the same.

The performance loss of the 7950X in 65W eco mode isn't 10-20% compared to stock. It's more like 20-30%:

https://www.youtube.com/watch?v=W6aKQ-eBFk0

If we take an overclocked 7950X – ignoring the additional power draw – put it in 65W eco mode, start from a rounded-up 22K GB6 score, and assume a 20% performance loss (~17600 GB6), then the 7950X, at 40% higher TDP than the M3 Max, is still roughly 10% slower than the M3 Max.

Which is especially insane considering the 7950X is a desktop-class CPU, and the M3 Max is sitting in a laptop.
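Spelling that arithmetic out (a minimal back-of-envelope using the rounded figures above; with the ~20.5K M3 Max multi-core scores quoted in this thread the gap comes out closer to 15%, same ballpark):

    # All inputs are the rough numbers from the comment above, not measurements.
    overclocked_7950x = 22000      # rounded-up GB6 multi-core score
    eco_mode_loss     = 0.20       # assumed ~20% drop at the 65W eco limit
    m3_max            = 20600      # GB6 multi-core score quoted upthread

    eco_score = overclocked_7950x * (1 - eco_mode_loss)   # ~17600
    deficit   = 1 - eco_score / m3_max                     # ~0.15
    print(f"7950X @ 65W eco: ~{eco_score:.0f} GB6, ~{deficit:.0%} behind the M3 Max")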


Since we agree on the non-linear relationship between TDP and benchmark scores, can we also agree that further reduction in power (to say 45w) would only cause a small drop in performance? For argument's sake, let's triple the 10% to 30% and also assume that TDP = avg power draw during the benchmark. The 5nm 7950x is then 30% slower than 3nm M3 Max.

My original argument was that Apple's advantage is mostly just TSMC.

> Which is especially insane considering the 7950X is a desktop-class CPU, and the M3 Max is sitting in a laptop.

7945HX3D is a 5nm 55W laptop part, and it scores 15-16k on GB6.


You haven't proven your argument then, and it shouldn't be a hard one to prove, as the 7940 and Apple's M2 Max both exist.

AMD's 7940 used TSMC's 4nm FinFET while the M2 used their older 5nm process.

And yet:

https://browser.geekbench.com/search?q=7940hs https://browser.geekbench.com/search?q=M2+Max

So, even when AMD has a process advantage over Apple (however slight), Apple still wins.


1) M2 uses N5P. N4 has little advantage over N5P.

2) The 7940HS is 35-54W. You'll see several results above/around 14k.


1. Little, or none, and it's as close a comparison as possible, which is why it'd be useful for proving your "It's all TSMC!" point, yet it fails to.

2. … and there are M2 Max results above/around 15k.


> Since we agree on the non-linear relationship between TDP and benchmark scores, can we also agree that further reduction in power (to say 45w) would only cause a small drop in performance?

You're confusing actual power with the system's configured power limit. When the default power limit is higher than the actual power draw during a given workload, then you obviously have headroom to lower that power limit quite a bit without severely reducing performance. That doesn't mean further reductions in the power limit will have similar impact, once you're working in the range where the power limit actually starts to kick in. And as you get to even lower power limits, a smaller fraction of that power budget is available for doing useful work as subsystems like the memory controller cannot reduce their power consumption as readily as the CPU cores.
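A toy model (made-up numbers, not measurements) of why this is non-linear: dynamic power scales roughly with V²·f, and voltage has to rise with frequency, so core power grows roughly with the cube of clock speed, while a chunk of platform power (memory controller, fabric, I/O) barely scales at all:

    # Toy model: performance ~ frequency, core power ~ f**3 (V rises roughly
    # linearly with f), plus a fixed "uncore" floor that doesn't scale.
    UNCORE_W   = 10.0   # assumed fixed platform power (memory controller, I/O)
    MAX_CORE_W = 90.0   # assumed core power at full frequency

    def relative_perf(power_limit_w):
        core_budget = max(power_limit_w - UNCORE_W, 0.0)
        freq = min((core_budget / MAX_CORE_W) ** (1 / 3), 1.0)  # clamp at stock clock
        return freq  # performance relative to the unconstrained chip

    for limit in (100, 65, 45, 25):
        print(f"{limit:>3} W limit -> ~{relative_perf(limit):.0%} of peak")

In this made-up example the first cut barely hurts (~85% of peak at 65W), but each further cut costs more as the fixed floor eats a larger share of the budget, which is the effect described above.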


Next year is going to be SEXY!


https://browser.geekbench.com/search?utf8=%E2%9C%93&q=7945HX...

Check this out, this processor is a 55-64 watt variant of the 7900X3D for mobile use.

It's neck and neck.


> It's neck and neck.

It’s clearly not.

The M3 Max SoC:

1. Uses half the power (unclear whether this includes GPU power usage)

2. Has +10% single core performance

3. Has +30% multi core performance

[1] https://browser.geekbench.com/v6/cpu/compare/3443230?baselin...


The 7945HX3D is a mobile version of the 7950X3D (not 7900X3D), but the only way to consider it a 64W CPU is to ignore the power used by the IO die—and a CPU isn't much use without a memory controller.

Don't mistake a long-term sustained power limit that OEMs can freely adjust for an actual power consumption measurement, especially when discussing a benchmark that only does short bursts of work.


I don't know that I'd call a 15-20% performance difference "neck and neck".


The 7945HX3D is an M2 competitor though.

8K series is around the corner now.


Yes, but not an M2 Pro, M2 Max, or M2 Ultra. Or an M3 Pro or M3 Max.

Sometime in early 2024 AMD is supposed to release a new APU (CPU+iGPU) called Strix. It doesn't seem particularly noteworthy, but in mid to late 2024 AMD is going to bring out a chip called Strix Halo that FINALLY brings a wider-than-128-bit memory system to an APU.

It baffles me that despite a GPU shortage that lasted years, and despite shipping huge numbers of Xbox Series X and PS5 consoles with nice memory systems, they didn't bother to ship a decent APU with a decent memory system for the desktop.

At least Strix Halo should give the M3 Pro a run for its money, though still half the M3 Max and a quarter of the M2 Ultra.


My guess is that the market wasn't ready at the time; now there's high consumer demand for handheld PCs.


> 8K series is around the corner now.

If they were going to announce Zen 5 high-end mobile parts at CES in January 2024, they would have launched Zen 5 desktop parts by now (because the high-end -HX mobile parts are literally the desktop silicon put into a BGA package instead of LGA). A successor to the 7945HX3D can't be much less than a year away, meaning the 7945HX3D is less than halfway through its product lifecycle.


Actually, their standard mobile lineup comes first. Then desktop, or if the gains are just a small step they skip it and go to premium laptops. So it goes standard laptop -> desktop -> premium laptop.

That is how AMD has released chips for the past six years.


Don't look at the model numbers, look at the architectures.

Zen 1 desktop processors (branded Ryzen 1000 series) were released spring 2017; Zen 1 mobile processors (branded Ryzen 2000 series) were released starting in fall 2017. Zen+ desktop processors (branded Ryzen 2000 series) were released spring 2018; Zen+ mobile processors (branded Ryzen 3000 series) were released at the beginning of 2019. Zen 2 desktop (branded Ryzen 3000 series) were released mid 2019, followed by Zen 2 mobile (branded Ryzen 4000 series) in spring 2020.

For Zen 3 desktop, they skipped 4000 series branding to catch up with the mobile branding: Zen 3 desktop (branded Ryzen 5000 series) launched late 2020, followed by Zen 3 mobile (branded Ryzen 5000 series) at the beginning of 2021. Zen 3+ (branded Ryzen 6000 series) was a mobile-only update to Zen 3 (same CPU microarchitecture, minor die shrink, new memory controller) launched at the beginning of 2022. Zen 4 desktop (branded Ryzen 7000 series) launched fall 2022, followed by Zen 4 mobile (branded Ryzen 7000 series) at the beginning of 2023.

Their new architectures launch on desktop and server first, using the same CPU chiplets in both segments. The monolithic mobile processors come later. But every year, they increment the model numbers of their mobile parts whether or not they have a new architecture, and the mobile parts are almost always announced at CES in January; that's simply how the laptop market functions.

Zen 5 desktop and server parts aren't here yet, so whatever 8000 series mobile parts they introduce at CES in January 2024 either won't be using Zen 5, or they'll be announcing at the beginning of the year but not shipping until fall at the earliest. Recent rumors suggest that their high-end monolithic mobile chip (a new product segment for them) has been delayed from late 2024 to early 2025.


Sometimes they do launches at and during CES, so definitely keep an eye out. The biggest benefit of announcing and releasing at the same time is that it prevents competitors from course-correcting their designs. Intel gets rug-pulled, which seems to have been the strategy for the last few years.

Also, if the 8K series is launched, it will likely be a small, limited run of their basic chips. Desktop chips will definitely come later in the year.


> Sometimes they do launches at and during CES, so definitely keep an eye out.

I get the feeling you didn't read my comment all that carefully.


> The 7945HX3D is an M2 competitor though.

Right now it’s competing with M3.


Don't forget, they also have the on-die RAM that cannot be upgraded; you can't get any more than what Apple offers, and only at exorbitant prices.


True, but then again you get 400GB/sec of bandwidth and low power use that allows for 10-20 hours of usage in a tiny thin laptop.

Sadly the 7950X doesn't come close to fitting in laptops, runs hot (even in the non-X flavor), and has a quarter of the memory bandwidth.

I'd happily trade DIMM slots for a fraction of the power usage and 4x the memory bandwidth (or 8x if you get the Studio).


That's where the 7945HX3D comes in. :)


Does cache help some things? Sure, but it's not a replacement for bandwidth. I have seen cases where the Zen 3 X3D wins against Zen 4 without X3D, especially in simulations, emulators, flight sims, etc.

Sadly, there seems to be a movement all over the industry to rely on caching over bandwidth; such designs generally win benchmarks but often lose in real-world use. The Intel N100 and similar embedded-type chips for appliances/routers went from 128-bit-wide memory to 64-bit. The M2 Pro has 256-bit-wide memory and the M3 Pro went to 192-bit. The Nvidia 3060 Ti had 256-bit-wide memory and the 4060 Ti went down to 128-bit.

Sure, the average performance is often higher (when workloads are cache-friendly), but performance becomes much more variable as the ratio between in-cache and out-of-cache performance grows. GPUs with more cache and less bandwidth tend to get pickier about which games they run well, and the 1% lows get slower, which makes stuttering worse.

It's sad that normal desktops, from a few hundred dollars to a few thousand, all have 128-bit-wide memory interfaces - unless you buy a Mac, where you can get 128, 256, 512, or 1024 bits wide.
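Those bus widths translate into peak bandwidth as roughly width/8 × per-pin data rate. A quick sketch, assuming LPDDR5-6400-class memory for the Apple parts and dual-channel DDR5-5600 for a typical desktop (real sustained numbers are lower):

    # Peak bandwidth ~= bus width in bits / 8 * transfers per second per pin.
    def peak_gb_s(bus_bits, gt_per_s):
        return bus_bits / 8 * gt_per_s

    configs = {
        "desktop dual-channel DDR5-5600 (128-bit)": (128, 5.6),
        "M3 Pro (192-bit)":                         (192, 6.4),
        "M2 Pro (256-bit)":                         (256, 6.4),
        "M3 Max (512-bit)":                         (512, 6.4),
        "M2 Ultra (1024-bit)":                      (1024, 6.4),
    }
    for name, (bits, rate) in configs.items():
        print(f"{name}: ~{peak_gb_s(bits, rate):.0f} GB/s")

The results (~90, ~154, ~205, ~410, ~819 GB/s) line up with the 150/200/400/800 GB/s figures Apple quotes.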


This is why I'm excited for the 8K chips: we will have some actually comparable chips with wider memory buses to put side by side with M-series processors.


> True, but then again you get 400GB/sec of bandwidth

For what, other than toy AI inference? More HEDT users would go with more RAM at DDR5 speeds than 400GB/sec of bandwidth. As you can see from Geekbench, the additional bandwidth has no real world implications for non-GPU use cases.

But again, just as with everything else it depends.


> For what, other than toy AI inference?

Processing multiple video or audio streams. Key use case for Macs.


Simulations (games or HPC), emulation, and editing multiple streams of 8k videos.

Why do you say "toy AI inference"? Getting 800GB/sec to 128GB of memory (on a desktop) or 400GB/sec to 128GB on a laptop is hard to beat. You have to spend crazy money to get a GPU with that much RAM, and normal desktops like the aforementioned 7950X are going to be 4-8x slower.

I've seen people get over 5 tokens/sec inference with 180B models, not what I'd call "toy AI inference".
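A rough upper bound for that kind of single-stream decoding, assuming the model is ~4-bit quantized and every generated token has to stream essentially all the weights from memory (KV cache and compute ignored):

    # tokens/s <= memory bandwidth / bytes of weights read per token
    params_billion  = 180          # a Falcon-180B-class model, as mentioned above
    bits_per_weight = 4.5          # assumed ~4-bit quantization plus overhead
    weight_gb = params_billion * bits_per_weight / 8     # ~101 GB of weights

    for name, bw_gb_s in [("M3 Max, 400 GB/s", 400), ("M2 Ultra, 800 GB/s", 800)]:
        print(f"{name}: <= ~{bw_gb_s / weight_gb:.1f} tokens/s")

So the reported >5 tokens/sec figures are in the right ballpark for the 800 GB/s parts.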


Most Intel processors with dual-channel memory will easily get 90GB/sec of bandwidth just for the CPU; if you build a performance-oriented machine you are supposed to pair it with a dedicated GPU. The Apple Silicon bandwidth is nothing special considering it must be shared with the GPU. Dedicated GPUs have as much or more bandwidth than Apple Silicon; better than that, at the low end they are highly likely to have more RAM available. When you compare prices, even on laptops, it is likely that the competing Windows PC will have both more bandwidth and more RAM available, because 8GB of RAM in a dedicated GPU is not rare at the prices MacBook Pros sell at, and those systems have loads of CPU RAM on top of it. Even with the update, a base M3 Pro only has 18GB of RAM to share between CPU and GPU. If you let the GPU use 8GB, like it would be able to in the Windows laptop, suddenly you only have 10GB of system RAM, which is going to be very limiting for many things.

In the end, every other OEM just gave up on bandwidth because it does not matter for most things. Even Apple somewhat acknowledged that by lowering the bandwidth available for most SKUs; so, there is that.

And stop quoting the higher-end bandwidth that is only available in extremely expensive SKUs:

- 800GB/s is only available in the Ultra version of the Mac Studio, and the cost equivalent would be a PC with two Nvidia 4090s that would completely crush it both bandwidth- and speed-wise.

- 400GB/s is only available in an at-least-$2K Mac Studio desktop or a minimum-$3K laptop, and it is nothing special compared to the bandwidth dedicated GPUs have at this price.

The bandwidth on the lower SKUs is whatever because it is barely better than what Intel has always provided in their CPUs with integrated graphics. It is a bit better but nothing special and there exist competing products with just as much bandwidth, not that this fact is particularly relevant for the type of tasks this low/mid-range hardware is supposed to carry out.

Apple fanboys are so delusional it's bordering on insanity.


> they also have the on-die RAM

On-package, not on-die. I know it's a technicality but it's an important one.


The base model shipping with 8GB, and the RAM upgrade pricing, are really the upsetting parts to me.

There is no reason for it to be 900% of what it could be.


Basically it's just that they have access to the latest node before anyone else, and not much more.


Unified memory is a huge advantage for the M series.

Also for end users the dedicated hardware for specific use cases e.g. ProRes is a major differentiator.


Pretty much. And the "advantage" does not even matter in the end, since it cannot run anything worthwhile at those speeds. The software in Apple's current marketing material hilariously consists of old/bad ports that really aren't the leaders in their fields and that generally just work better/faster on a Windows PC (sometimes even on Linux).

Which is why the quoted battery life is complete bullshit, suited only to Apple fanboys who never go outside the walled garden. If you start using the software other people use and need, the battery life becomes a lot less impressive, because that software actually makes use of the hardware.

It was ok when Apple wasn't too bad of a deal from a hardware pricing standpoint (considering build quality and potential longevity thanks to ease of repair/upgrade) but now it is plain stupid.

As far as I'm concerned, unless you are developing for Apple platforms or really have to use Logic/Final Cut, it doesn't make sense to invest in an Apple computer at the moment, considering the price. They just don't offer anything worth the pricing delta, no matter what the marketing and fanboys say.

Entry-level machines are barely relevant, but with the upgrade pricing on RAM/storage, high-end machines just don't matter in the grand scheme of things. I could buy 2 fully loaded PCs for the price of one high-end Mac Studio; there is no way it can make up that difference in productivity, no matter how little power it consumes (power that is cheap, and conveniently only gets used when actual work is being done...)


Looks like it's on par [1] with a top-of-the-line Core i9-13900KS but at a fraction of the wattage.

I'd never buy one, but I'd love to see Apple do an Apple Silicon Mac Pro build at an equivalent watt/heat budget.

1. https://browser.geekbench.com/processor-benchmarks


And OpenCL ("Geekbench 6 Compute") scores comparable to an RTX 2070:

https://browser.geekbench.com/opencl-benchmarks


Doing a few sketchy conversions puts Metal compute performance closer in capability to an RTX 4070.

The M3 Max is ~170% of the M1 Max per a graphic Apple released. That puts it at ~95% of the performance of a 6800 XT on the Metal charts. The RTX 4070 is ~95% of the performance of the 6800 XT on the Vulkan charts.
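The chained-ratio reasoning spelled out (a sketch of the numbers above; comparing relative standings across different APIs and charts like this is very rough):

    # Relative positions, anchored on the RX 6800 XT appearing in both charts.
    m3_max_vs_6800xt_metal   = 0.95   # M3 Max ~95% of a 6800 XT (Metal chart)
    rtx4070_vs_6800xt_vulkan = 0.95   # RTX 4070 ~95% of a 6800 XT (Vulkan chart)

    estimate = m3_max_vs_6800xt_metal / rtx4070_vs_6800xt_vulkan
    print(f"M3 Max vs RTX 4070 (cross-API, very rough): ~{estimate:.2f}x")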


That's not bad actually, because the bottleneck in training is RAM and we can have 128GB...


If you're training a model that needs 128GB, the GPU speed is going to be a "bottleneck".


I believe that Apple Silicon is still too slow for training. The big RAM is really useful for inference of big models though.



Does this mean it's actually as fast in real-world scenarios? I'm not that familiar with Geekbench


Intel hasn't even been on point for a while.


M3 Max 3151 / 20463

7950X3D 3173 / 22341

As a developer, my Go builds take ~5 seconds. I invested in very fast NVMe drives, and coming from ~5-minute Java builds this is great. Though what I want is <1 sec builds.

What I want is AMD/Intel moving to RISC-V, without the 50 years of x86 baggage and the 30 years of ARM baggage.

Give me cache on the die instead of legacy transistors; give me faster NVMe directly to memory.


The movie Hackers promised this in 1996 and we are still waiting.


That was a RISC chip, as opposed to CISC, not RISC-V.


You can find other M3 models by searching for "mac15":

https://browser.geekbench.com/search?q=mac15


OT: Yesterday, an Apple forum thread [1] complaining about the performance of the current M3 chips vs older generations got to the homepage of HN [2].

It was flagged and killed, I assume because it was somehow misleading.

As someone who doesn't understand these issues, could some of the knowledgeable fellows here ELI5 why that thread is junk, how the current GPUs compare to those from the Intel days, and how important the M3 GPU is for LLM-related tasks?

And maybe also - I see that the M3 is lower than the M2 in... "P-cores"? I don't understand what this means. In which cases would the M2 be better than the M3?

[1]: https://developer.apple.com/forums/thread/693678 [2]: id=38094895, not posting the link as I am not trying to revive it


Roughly 25-30% faster than the original M1 Max, which scored ~2400 on single-core.

As mentioned elsewhere it’s 10% faster than the M2 Max. This is really an impressive step up each year compared to what came before. If Apple continues this trend I’ll probably upgrade every 2 years.


Hmm, I was under the opposite impression. Not much of an upgrade. The big change was the move to the M-series chips in the first place, mainly in battery life / power usage. There's no big need to upgrade if you have an M1, for instance.


Anyone have insight as to whether these will be useful for ML tasks - mostly on pre-trained models?

I have bulk jobs that I like to do for things like transcription and I'd like to add summarization as well (Whisper + Llama basically). My GTX 1070 is fine for proof-of-concept, but I was planning to build a 4090 box for this stuff.

Nvidia is notorious for nerfing the RTX cards for ML tasks, on RAM specifically. Is there any universe in which it makes sense to put my 4090 box budget ($4000+ all in) towards an updated MacBook? I can get a laptop with 128GB RAM for $4600, and the convenience of not having to run things remotely and deal with another OS would be a big win.


If you want to run ML tasks, a 4090 will smoke any laptop, plain and simple. You get to decide about the convenience of having a laptop. I wouldn't want to deal with the inconvenience of a non-Nvidia stack.


4090 has 32GB of available memory versus up to 128GB on the Mac.

That allows some training to be slow but at least possible. When you look at benchmarks, the memory can make a difference, especially for LLMs:

https://www.reddit.com/r/LocalLLaMA/comments/16o4ka8/running...


Huh? The 4090 has 24GB of RAM which is independent of the system RAM.


That's not true. RTX 4090s have 24GB.


Have you tried running remotely via Parsec (if you want the full desktop experience)? With a proper internet connection, you won't realize you are on a remote connection.


I will make a note to try Parsec. I've been using RDP with both devices connected to the same LAN over Ethernet. The experience is very good; the biggest annoyance is that I have to deal with Windows.


Seems like the winning formula is to have some performance cores and some efficiency cores, but not too many and only two kinds. Meanwhile, Qualcomm Snapdragon mobile processors have 3-4 different kinds of cores and a third more of them in total. No wonder Android phones with much larger batteries are getting so much worse battery life. And what's funny is that they end up being slower despite all that, because their performance cores are forced to be clocked lower to make up for the higher power density, and most things on a phone won't take advantage of those extra cores.





