> Arm is generally more efficient than x86 This is not entirely true in general ...

Vogtinator · on Sept 26, 2023

> This is not entirely true in a general sense. Yes, a typical ARM CPU is indeed more energy efficient, but theoretically nothing prevents x86 from being nearly as efficient.

The very complex instruction set does. You can easily throw multiple decoders at Arm code, but x86 scales badly due to its variable-length instructions. Current cores need predecoders to find instruction boundaries which is just not needed with fixed width instructions, and even then the higher-numbered decoders can only handle the simpler instructions.
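To make the boundary problem concrete, here is a rough sketch using a toy encoding. The `toy_length` rule and the byte stream are made up for illustration and have nothing to do with real x86 or Arm encodings. It only shows the structural difference: fixed-width start offsets are known up front, while variable-length offsets each depend on the previous instruction's length.

```python
from typing import Callable, List


def fixed_width_boundaries(code_len: int, width: int = 4) -> List[int]:
    """Start offsets for fixed-width instructions.

    Every offset is just i * width, so each decoder can be handed its own
    slot without inspecting any other instruction.
    """
    return list(range(0, code_len, width))


def toy_length(code: bytes, pos: int) -> int:
    """Hypothetical length rule: low 2 bits of the first byte, plus 1 (1..4)."""
    return (code[pos] & 0b11) + 1


def variable_length_boundaries(
    code: bytes, length_of: Callable[[bytes, int], int]
) -> List[int]:
    """Start offsets for variable-length instructions.

    Offset n+1 is only known once the length of instruction n has been
    worked out, so the scan is inherently serial unless extra hardware
    (a predecoder, speculation at every byte) is added.
    """
    offsets = []
    pos = 0
    while pos < len(code):
        offsets.append(pos)
        pos += length_of(code, pos)
    return offsets


# Made-up byte stream under the toy encoding above.
stream = bytes([0x00, 0x01, 0xAA, 0x03, 0x10, 0x20, 0x30, 0x02, 0x11, 0x22])

print(fixed_width_boundaries(12))                      # [0, 4, 8]
print(variable_length_boundaries(stream, toy_length))  # [0, 1, 3, 7]
```

Real hardware does not walk the bytes one by one like this loop; the point is only that the dependency chain exists for one encoding and not for the other.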

ernst_klim · on Sept 26, 2023

> Current cores need predecoders to find instruction boundaries which is just not needed with fixed width instructions

The question is how much overhead that causes compared to the whole picture. There is empirical evidence that the answer is "very little":

https://chipsandcheese.com/2021/07/13/arm-or-x86-isa-doesnt-...

> With the op cache disabled via an undocumented MSR, we found that Zen 2’s fetch and decode path consumes around 4-10% more core power, or 0.5-6% more package power than the op cache path. In practice, the decoders will consume an even lower fraction of core or package power.

mathisfun123 · on Sept 26, 2023

> The very complex instruction set does.

i.e., PSPACE ⊆ EXPTIME

https://en.wikipedia.org/wiki/EXPTIME

which is funny because people are always like "uh why do i need to understand asymptotics when machines are so fast". well the answer is the asymptotics catch up to you when the speed of light isn't infinite or when you're timing things down to the nanosecond.

AshamedCaptain · on Sept 26, 2023

Arm is practically as complex as x86... It supports multiple varieties (e.g. v7, thumb, thumb2, jazelle, v8, etc.), carries lots of historical mistakes, has absurdly complex instructions even in the core set (ldm/stm), and has a legacy almost as long as x86's. It even has variable-length instructions too...

pjmlp · on Sept 27, 2023

Many of which were dropped for 64-bit ARM.

AshamedCaptain · on Sept 27, 2023

Only jazelle and thumb v1 are dropped from most v8 non-ulp cores, and then only half dropped: they still consume decoding resources (e.g. jazelle mode is actually supported and the processor will parse jvm opcodes, it's just that all of them will interrupt). We are stuck with the rest as much as Intel is stuck with the 8087: it is about time they did some culling, but that won't happen without backlash.

pjmlp · on Sept 28, 2023

I stand corrected, thanks.

JonChesterfield · on Sept 26, 2023

I'm not sure this holds. x64 decodes instructions (which is awkward), stores the decoded result in a cache, and then executes from that cache. So the decoding cost is only paid on a cache miss, and a cache miss on a deeply pipelined CPU is roughly game over for performance anyway.
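A back-of-the-envelope way to express that argument is a weighted average over cache hits and misses. This is only a sketch: `average_frontend_cost` is a hypothetical helper, the units are arbitrary, and the numbers are made up rather than measured on any real core.

```python
def average_frontend_cost(
    hit_rate: float, op_cache_cost: float, decode_cost: float
) -> float:
    """Expected per-instruction front-end cost.

    A fraction `hit_rate` of instructions is served from the decoded-op
    cache; the rest go through the full fetch-and-decode path.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * op_cache_cost + (1.0 - hit_rate) * decode_cost


# Illustrative assumption: a full decode costs twice as much as a cache hit.
for rate in (0.0, 0.5, 0.9, 1.0):
    cost = average_frontend_cost(rate, op_cache_cost=1.0, decode_cost=2.0)
    print(f"hit rate {rate:.1f}: average cost {cost:.2f}")
```

The higher the hit rate, the less the cost of the awkward decode path shows up in the average, which is the shape of the claim being made here.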