Hacker News

How does that work? The binary format embeds variants of the same program?


Yes, here is an example of how it works in GCC.

https://gcc.gnu.org/onlinedocs/gcc-13.1.0/gcc/Function-Multi...
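For illustration, here is a minimal sketch of what that looks like with GCC's `target_clones` attribute. The function name `sum_ints` and the `MULTIVERSIONED` macro are made up for this example, and the guard is an assumption about where the attribute is usable (it relies on ifunc support); elsewhere the code falls back to a single plain version.

```c
#include <stdlib.h>  /* size_t; on glibc this header also defines __GLIBC__ */

/* target_clones needs ifunc support, so enable it only for GCC-compatible
   compilers on x86-64 with glibc; elsewhere build a single plain version. */
#if defined(__x86_64__) && defined(__GNUC__) && defined(__GLIBC__)
#define MULTIVERSIONED \
    __attribute__((target_clones("default", "avx2", "avx512f")))
#else
#define MULTIVERSIONED
#endif

/* With the attribute active, the compiler emits one clone per listed
   target plus a resolver that selects a clone for the running CPU. */
MULTIVERSIONED
long long sum_ints(const int *values, size_t count)
{
    long long total = 0;
    for (size_t i = 0; i < count; i++)
        total += values[i];
    return total;
}
```

Callers just call `sum_ints(data, n)` as usual; the variant selection happens behind the scenes, so yes, the binary carries several copies of the same function.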


On some Linux distros, the package manager downloads different binaries based on your CPU. Client Skylake would get x86-64-v3 and Zen 4 would get x86-64-v4, for example.

And there are other schemes for shipping multiple microarchitecture variants side by side, like hwcaps.
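A rough sketch of the hwcaps idea, as I understand the glibc-hwcaps layout: the dynamic loader looks in per-level subdirectories, best level first, before falling back to the plain library directory. The function name, the buffer size, and the `/usr/lib64` path used below are hypothetical; this only models the search order and is not glibc's actual code.

```c
#include <stdio.h>

#define HWCAPS_PATH_MAX 256

/* Fill `out` with candidate directories for a CPU at `cpu_level`
   (x86-64-v1 .. v4), most specific first, ending with the base
   directory. Returns the number of entries written. */
int hwcaps_search_order(const char *libdir, int cpu_level,
                        char out[][HWCAPS_PATH_MAX], int max_out)
{
    int n = 0;

    if (cpu_level > 4)
        cpu_level = 4;  /* no level above v4 is defined today */

    /* v1 is the baseline, so only v2 and up get their own subdirectory. */
    for (int level = cpu_level; level >= 2 && n < max_out; level--) {
        snprintf(out[n], HWCAPS_PATH_MAX, "%s/glibc-hwcaps/x86-64-v%d",
                 libdir, level);
        n++;
    }

    /* Fallback: the baseline build in the ordinary library directory. */
    if (n < max_out) {
        snprintf(out[n], HWCAPS_PATH_MAX, "%s", libdir);
        n++;
    }
    return n;
}
```

For a v3 CPU and `/usr/lib64`, this yields the v3 directory, then v2, then `/usr/lib64` itself, so a package only has to ship optimized copies of the libraries where it matters.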


Isn’t this going to get very unmanageable very soon? Intel seems to add extensions every other year or so.


The extensions can be kinda broken down into 4 levels. Basically ancient (x86-64-v1), old (v2, SSE 4.2), reasonably new (v3: AVX2, Haswell/Zen 1 and up), and baseline AVX-512 (v4).

https://developers.redhat.com/blog/2021/01/05/building-red-h...

There is discussion of a fifth level. Someone in the Intel Clear Linux IRC said a fifth level wasn't "worth it" for Sapphire Rapids because most of the new AVX512 extensions were not autovectorized by compilers, but that a new level would be needed in the future. Perhaps they were thinking of APX, but couldn't disclose it.
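To make the four levels concrete, here is a small sketch that maps a set of CPU features to a level. It checks only a subset of each level's requirements (v3, for instance, also requires things like LZCNT, MOVBE and F16C), and the struct and function names are invented for this example. The detection helper assumes a GCC-compatible compiler on x86-64 and uses `__builtin_cpu_supports`.

```c
#include <stdbool.h>

/* A subset of the features that define each level. */
struct cpu_features {
    bool sse4_2, popcnt;                                   /* v2 */
    bool avx2, bmi2, fma;                                  /* v3 */
    bool avx512f, avx512bw, avx512cd, avx512dq, avx512vl;  /* v4 */
};

/* Levels are cumulative: a CPU is at level N only if it also meets
   every level below N. */
int x86_64_level(const struct cpu_features *f)
{
    if (!(f->sse4_2 && f->popcnt))
        return 1;
    if (!(f->avx2 && f->bmi2 && f->fma))
        return 2;
    if (!(f->avx512f && f->avx512bw && f->avx512cd &&
          f->avx512dq && f->avx512vl))
        return 3;
    return 4;
}

#if defined(__x86_64__) && defined(__GNUC__)
/* Query the running CPU via compiler builtins. */
struct cpu_features detect_cpu_features(void)
{
    struct cpu_features f;
    __builtin_cpu_init();
    f.sse4_2   = __builtin_cpu_supports("sse4.2") != 0;
    f.popcnt   = __builtin_cpu_supports("popcnt") != 0;
    f.avx2     = __builtin_cpu_supports("avx2") != 0;
    f.bmi2     = __builtin_cpu_supports("bmi2") != 0;
    f.fma      = __builtin_cpu_supports("fma") != 0;
    f.avx512f  = __builtin_cpu_supports("avx512f") != 0;
    f.avx512bw = __builtin_cpu_supports("avx512bw") != 0;
    f.avx512cd = __builtin_cpu_supports("avx512cd") != 0;
    f.avx512dq = __builtin_cpu_supports("avx512dq") != 0;
    f.avx512vl = __builtin_cpu_supports("avx512vl") != 0;
    return f;
}
#endif
```

Note that a CPU with AVX-512 but missing a v3 feature would still be classified as v2 here, which is the point of the levels being strictly layered.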


AVX10/APX does sound like a good baseline for v5.


except that it doesn't support full AVX-512, making the whole idea of backward compatibility between these levels meaningless. "It's Intel!!!"


Well, that's an even better justification, as an x86-64-v5 level would be needed for the newer CPUs.

We can throw away any hope of v4 being a standard baseline.


It’s easy to fully automate and storage is relatively cheap these days.


I'd think the issue would be more build infra: every new variant means you have to build the world again.


Again, compute is surprisingly cheap these days.

Work out what it would cost to compile - say - a terabyte of C code at typical cloud spot prices.

A large VM with 128 cores can compile the 100 MB Linux kernel source tree in about 30 seconds. So… 200 MB/minute or 12 GB/hour. This would take roughly 80 hours for a terabyte (1,000 GB / 12 GB/hour ≈ 83).

A 120-core AMD server is about 50c per hour on Azure (Linux spot pricing).

So… about $40 to compile an entire distro. Not exactly breaking the bank.
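The back-of-envelope math above, spelled out as a sketch. The inputs are just the figures quoted in this comment (100 MB in 30 seconds, 1 TB of source, $0.50/hour), not measurements, and the function names are made up for this example.

```c
/* Compile throughput in GB/hour, given how long a source tree of
   `source_mb` megabytes takes to build. */
double throughput_gb_per_hour(double source_mb, double seconds)
{
    return (source_mb / 1000.0) * (3600.0 / seconds);
}

/* Wall-clock hours to compile `total_gb` of source at that rate. */
double build_hours(double total_gb, double gb_per_hour)
{
    return total_gb / gb_per_hour;
}

/* Total spend at a flat hourly machine price. */
double build_cost_usd(double hours, double usd_per_hour)
{
    return hours * usd_per_hour;
}
```

Plugging in the numbers: `throughput_gb_per_hour(100, 30)` is about 12 GB/hour, `build_hours(1000, 12)` is about 83 hours, and `build_cost_usd(83.3, 0.50)` comes to a bit under $42, which matches the "about $40" figure. It assumes throughput scales linearly with source size, which is the simplification the estimate rests on.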


You'd have to separate out compiling and linking at a bare minimum to get even a semi-accurate model. Plus a lot of userspace is C++, which is much, much slower to compile.


Yes. Also, test it.


That can also be largely automated.


LTO does occasionally break things in hard-to-detect ways, but I have never heard of an x86 -march compilation bug.


in the end it will be like any other modern hardware appliance:

the hardware is the same design for cost saving purposes, but different features are unlocked for $$$ by a software license key.

You want AVX-512? Pay up to unlock the feature in your CPU, and then you can use it. This could also enable a pay-as-you-go licensing scheme for CPUs, creating recurring revenue for Intel.

from the hardware perspective - the same silicon, but different features sold separately



