Branch predictors have gotten really good, and it now often makes more sense to rely on them rather than working the branches away.
For example, modern compilers will very rarely introduce conditional moves (cmov) on x86 because they are nearly always slower than simply branching. It might be counterintuitive, but branch prediction breaks the dependency between the micro-ops of the condition and those of the clause. So if your cmov's condition depends on a load, the cmov has to wait for that load to complete before it can execute, whereas a predicted branch lets the following instructions run speculatively in the meantime.
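To make the dependency point concrete, here is a minimal C sketch of the two shapes. The function names `pick_branch` and `pick_select` and the `uint32_t` types are made up for illustration, and whether a given compiler actually emits a jump or a cmov for either form depends on the compiler, target, and flags, so treat the comments as describing the intent rather than guaranteed codegen.

```c
#include <stdint.h>

/* Branchy form: if the compiler emits a conditional jump, the CPU can
   predict the outcome and keep executing speculatively while the load
   of *flag is still in flight. */
uint32_t pick_branch(const uint32_t *flag, uint32_t a, uint32_t b) {
    if (*flag) {
        return a;
    }
    return b;
}

/* Branchless form: a mask-based select. The result is data-dependent on
   the loaded flag, so nothing that consumes it can start before the load
   completes. A cmov has the same kind of dependency. */
uint32_t pick_select(const uint32_t *flag, uint32_t a, uint32_t b) {
    uint32_t mask = (uint32_t)0 - (uint32_t)(*flag != 0); /* all ones or 0 */
    return (a & mask) | (b & ~mask);
}
```

Both functions return the same value for the same inputs; the difference is only in how the hardware may schedule the work around them.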
> For example, modern compilers will very rarely introduce conditional moves
For conditionally selected data that lives in registers (and occasionally on the stack), GCC seems to always use cmov. That makes sense: a cmov is much cheaper than a branch that might be taken with p=0.5.
You do have a very good point about data dependencies.
- for "c ? a : b" (where a, b, c are function arguments), all 4 versions use their equivalent of cmov
- for "c ? *a : *b", the x64 version uses cmov on the address, whereas AArch64 uses a "full" branch
- AArch32 always uses conditional instructions in these 2 expressions, and additionally, "a * (b & 1)" gets optimized into "a & ((b & 1) ? ~0 : (b & 1) /* = 0 */)"
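
For anyone who wants to reproduce this, the three expressions can be wrapped in small functions like the ones below and compiled per target. This is only a sketch: the function names and the `int`/`unsigned` types are my assumptions (the exact signatures used above are not shown), and the output will vary with compiler version and optimization level.

```c
/* "c ? a : b" with all three operands passed as arguments. */
int select_values(int c, int a, int b) {
    return c ? a : b;
}

/* "c ? *a : *b": the selected value has to be loaded from memory. */
int select_loads(int c, const int *a, const int *b) {
    return c ? *a : *b;
}

/* "a * (b & 1)": multiply by the low bit of b. */
unsigned mul_by_low_bit(unsigned a, unsigned b) {
    return a * (b & 1);
}

/* The masked form the multiply can be rewritten into:
   keep a when the low bit of b is set, otherwise produce 0. */
unsigned mul_by_low_bit_masked(unsigned a, unsigned b) {
    return a & ((b & 1) ? ~0u : 0u);
}
```

The last two functions compute the same result for every input, which is what allows a compiler to swap the multiply for a conditional mask.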