Hacker News

> Arm is generally more efficient than x86

This is not entirely true in the general sense. Yes, a typical ARM CPU is indeed more energy efficient, but in theory nothing prevents x86 from being nearly as efficient.

The main reason Apple silicon is more efficient is that it is basically a mobile chip, and competition in mobile is harsh, so every vendor had to optimize its chips heavily for energy efficiency.

On the other hand, until Apple silicon and AMD's recent ascent, Intel had a monopoly on the laptop market and no incentive to improve. Just look at how quickly Intel developed an asymmetric, Arm-like P/E-core architecture right after Apple silicon emerged. Let's hope this new competitor eventually forces Intel and AMD to produce more energy-efficient x86 chips.



> This is not entirely true in the general sense. Yes, a typical ARM CPU is indeed more energy efficient, but in theory nothing prevents x86 from being nearly as efficient.

The very complex instruction set does. You can easily throw multiple decoders at Arm code, but x86 scales badly because of its variable-length encoding. Current cores need predecoders to find instruction boundaries, which is just not needed with fixed-width instructions, and even then the higher-numbered decoders can only handle the simpler instructions.
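
To make the boundary problem concrete, here is a minimal sketch in Python. The variable-length encoding is a made-up toy (the first byte of each instruction holds its total length), not real x86, and both function names are hypothetical. It only illustrates why fixed-width start offsets can be computed independently while variable-length ones depend on every earlier instruction.

```python
def fixed_width_boundaries(code: bytes, width: int = 4) -> list[int]:
    """Start offsets for a fixed-width ISA.

    Each offset is a multiple of `width`, so any number of decoders
    can pick their slot without looking at earlier instructions.
    """
    return list(range(0, len(code), width))


def variable_length_boundaries(code: bytes) -> list[int]:
    """Start offsets for a toy variable-length ISA.

    Assumption (toy encoding, not x86): the first byte of every
    instruction is its total length in bytes. The start of instruction
    N is only known after the lengths of instructions 0..N-1 are known,
    so this scan is inherently serial.
    """
    starts = []
    offset = 0
    while offset < len(code):
        starts.append(offset)
        length = code[offset]
        if length == 0:
            raise ValueError(f"zero-length instruction at offset {offset}")
        offset += length
    return starts


if __name__ == "__main__":
    toy = bytes([1, 3, 0xAA, 0xBB, 2, 0xCC, 1])
    print(variable_length_boundaries(toy))
    print(fixed_width_boundaries(bytes(8)))
```

A hardware predecoder attacks the serial dependency with speculation and marker bits rather than a loop, but the dependency itself is the point being argued here.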


> Current cores need predecoders to find instruction boundaries which is just not needed with fixed width instructions

The question is how much overhead that causes relative to the whole picture. There is empirical evidence that the answer is "very little":

https://chipsandcheese.com/2021/07/13/arm-or-x86-isa-doesnt-...

> With the op cache disabled via an undocumented MSR, we found that Zen 2’s fetch and decode path consumes around 4-10% more core power, or 0.5-6% more package power than the op cache path. In practice, the decoders will consume an even lower fraction of core or package power.


> The very complex instruction set does.

i.e., PSPACE ⊆ EXPTIME

https://en.wikipedia.org/wiki/EXPTIME

Which is funny, because people are always like "uh, why do I need to understand asymptotics when machines are so fast". Well, the answer is that the asymptotics catch up to you when the speed of light isn't infinite or when you're timing things down to the nanosecond.


Arm is practically as complex as x86... It supports multiple varieties (e.g. v7, Thumb, Thumb-2, Jazelle, v8, etc.), carries lots of historical mistakes, has absurdly complex instructions even in the core set (LDM/STM), and has a legacy almost as long as x86's. It even has variable-length instructions too...


Many of those were dropped for 64-bit ARM.


Only Jazelle and Thumb v1 are dropped from most v8 non-ULP cores, and even then only half dropped: they still consume decoding resources (e.g. Jazelle mode is actually supported and the processor will parse JVM opcodes, it is just that every one of them will raise an interrupt). We are stuck with the rest as much as Intel is stuck with the 8087. It is about time they did some culling, but that will not happen without backlash.


I stand corrected, thanks.


I'm not sure this holds. x64 decodes instructions (which is awkward) and stores the resulting micro-ops in a cache, then executes from that cache. So the decoding cost is only paid on a cache miss, and a cache miss on a deeply pipelined CPU is roughly game over for performance anyway.
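
A toy model of that argument, in Python. Everything here is an assumption for illustration: the function name `count_decodes` is hypothetical, the cache is a plain FIFO keyed by instruction address, and real op caches are organised differently. It only shows that a hot loop which fits in the cache pays the decode cost once per instruction, not once per execution.

```python
def count_decodes(trace: list[int], capacity: int) -> int:
    """Count how often the (expensive) decoder runs for an address trace.

    Toy assumptions: one cache entry per instruction address, FIFO
    eviction, capacity >= 1. Decoding happens only on a cache miss.
    """
    if capacity < 1:
        raise ValueError("capacity must be at least 1")
    cache: dict[int, str] = {}  # address -> decoded ops (dicts keep insertion order)
    decodes = 0
    for addr in trace:
        if addr in cache:
            continue  # hit: reuse the already decoded ops
        decodes += 1  # miss: run the decoder
        if len(cache) >= capacity:
            oldest = next(iter(cache))  # FIFO: evict the first inserted entry
            del cache[oldest]
        cache[addr] = f"uops@{addr}"
    return decodes


if __name__ == "__main__":
    hot_loop = [0, 4, 8, 12] * 100  # a 4-instruction loop run 100 times
    print(count_decodes(hot_loop, capacity=8))  # loop fits in the cache
    print(count_decodes(hot_loop, capacity=2))  # loop thrashes the cache
```

With the loop fitting in the cache, the decoder runs 4 times for 400 executed instructions. With a cache smaller than the loop, FIFO eviction makes every access miss, which is the case where decode cost would actually show up.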



