On some Linux distros, the package manager downloads different binaries depending on your CPU. A client Skylake chip would get x86-64-v3 and Zen 4 would get x86-64-v4, for example.
And there are schemes for shipping code for multiple architecture levels side by side, like glibc-hwcaps, where the dynamic loader picks the best library build for the running CPU.
The extensions can be roughly broken down into 4 levels, x86-64-v1 through v4. Basically ancient (the original x86-64 baseline), old (SSE4.2), reasonably new (AVX2, Haswell/Zen 1 and up), and baseline AVX-512.
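For anyone who wants to see which of those four levels their own machine lands on, here is a rough Python sketch. It assumes Linux and the flag spellings used in /proc/cpuinfo (SSE3 appears there as "pni" and LZCNT as "abm"). The names `LEVEL_FLAGS`, `x86_64_level` and `read_cpu_flags` are made up for this example, and the flag sets follow the published level definitions as I understand them, so treat it as an approximation rather than what any distro's package manager actually runs.

```python
# Flag names as they appear on the "flags" line of /proc/cpuinfo (Linux).
# Each level requires its own flags plus everything from the levels below it.
LEVEL_FLAGS = [
    (1, {"lm", "cmov", "cx8", "fpu", "fxsr", "mmx", "syscall", "sse2"}),
    (2, {"cx16", "lahf_lm", "popcnt", "pni", "sse4_1", "sse4_2", "ssse3"}),
    (3, {"avx", "avx2", "bmi1", "bmi2", "f16c", "fma", "abm", "movbe", "xsave"}),
    (4, {"avx512f", "avx512bw", "avx512cd", "avx512dq", "avx512vl"}),
]


def x86_64_level(flags):
    """Return the highest x86-64-vN level (0 to 4) fully covered by `flags`."""
    present = set(flags)
    level = 0
    for n, required in LEVEL_FLAGS:
        if not required <= present:
            break  # a missing flag at this level caps the result
        level = n
    return level


def read_cpu_flags(path="/proc/cpuinfo"):
    """Parse the first 'flags' line of /proc/cpuinfo into a set of names."""
    with open(path) as f:
        for line in f:
            if line.startswith("flags"):
                return set(line.split(":", 1)[1].split())
    return set()
```

On a Linux x86-64 box, `x86_64_level(read_cpu_flags())` should give 3 for a typical AVX2-era chip and 4 for one with the baseline AVX-512 set.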
There is discussion of a fifth level. Someone in the Intel Clear Linux IRC said a fifth level wasn't "worth it" for Sapphire Rapids because compilers don't autovectorize for most of the new AVX-512 extensions, but that a new level would be needed in the future. Perhaps they were thinking of APX but couldn't disclose it.
Work out what it would cost to compile - say - a terabyte of C code at typical cloud spot prices.
A large VM with 128 cores can compile the 100 MB Linux kernel source tree in about 30 seconds. So… 200 MB/minute or 12 GB/hour. At that rate a terabyte would take about 83 hours, call it 80.
A 120 core AMD server is about 50c per hour on Azure (Linux spot pricing).
So… about $40 to compile an entire distro. Not exactly breaking the bank.
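The back-of-envelope math above, written out as a small Python sketch so the inputs are easy to swap. The function name `compile_cost` and its parameters are invented for illustration. The defaults are the figures quoted in this thread (100 MB in 30 seconds, 50c per hour), and it assumes throughput scales linearly with source size and uses decimal units (1 TB = 1000 GB).

```python
def compile_cost(source_gb, sample_mb=100.0, sample_seconds=30.0,
                 price_per_hour=0.50):
    """Scale a measured compile time linearly to a larger source size.

    Returns (hours, dollars). Linear scaling is an assumption: it ignores
    link time and any difference between C and C++ compile speed.
    """
    # 100 MB in 30 s is 0.1 GB * 120 runs per hour = 12 GB/hour.
    gb_per_hour = (sample_mb / 1000.0) * (3600.0 / sample_seconds)
    hours = source_gb / gb_per_hour
    return hours, hours * price_per_hour


hours, dollars = compile_cost(1000.0)  # 1 TB of source
print(f"{hours:.1f} hours, ${dollars:.2f}")  # 83.3 hours, $41.67
```

Rounding 83 hours down to 80 gives the $40 figure. Halving the throughput to account for slower code only doubles it, so the conclusion isn't very sensitive to the exact inputs.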
You'd have to separate out compiling and linking at a bare minimum to get even a semi-accurate model. Plus a lot of userspace is C++, which is much, much slower to compile.
In the end it will be like any other modern hardware appliance:
The hardware is the same design for cost-saving purposes, but different features are unlocked for $$$ by a software license key.
You want AVX-512? Pay up, unlock the feature in your CPU, and you can use it. This could also enable a pay-as-you-go license scheme for CPUs, creating recurring revenue for Intel.
From the hardware perspective: the same silicon, with different features sold separately.