This is an example of a transformation that probably costs more in compile time than it wins back in runtime performance. GCC has explicitly rejected this kind of optimization in the past for precisely that reason.
In general, compilers face an iron triangle of optimization: do you want a fast compiler, a small binary, or fast output? Optimizations that help all three are rare (especially ones that aren't already implemented in -O1), and even two of the three are often difficult to obtain.
> This is an example of a transformation that is probably more expensive in compile-time
Why? Reasoning that the value must be 1 after the loop is not even a loop optimization; it follows directly from the exit condition of the branch. That leaves the loop itself, but nothing it computes is used anymore, and since C++ guarantees that it exits, you can simply remove it.
The transformation as a whole might not fire often, but having a constant for the induction variable after the loop and being able to delete useless loops will each produce runtime wins.
> In general, compilers face an iron triangle of optimization: do you want a fast compiler, a small binary, or fast output?
I often wish compilers provided more options to just brute-force a bit more performance out of release builds. You can manually tune many parameters, but that requires knowing a lot more about how the passes work than simply saying --compile-for=5h (or something like that; it doesn't need to be a hard time limit).
Note that GCC does do this, just at -O2 and above, whereas Clang does it at -O1 and above. GCC also only does it on C++ code, whereas Clang does it on both C and C++.
I'm not saying it isn't magical, and it certainly requires some symbolic solving, but it's doable.
Of course the real goal is to eventually get it to optimize