Hacker News

No. The key observation of BOLT is that by collecting a profile on an optimized binary and then mapping that profile onto a disassembly of that same optimized binary, you get better profile fidelity.

My intuition for why BOLT works is that:

- If you try to profile an unoptimized (or even insufficiently optimized) binary, you don't get an accurate profile, because the timings are different from the binary you'll actually ship.

- If you try to profile an optimized binary and then rerun the compiler from source using that profiling data, then you'll have a bad time mapping the profiler's observations back to the source. The compiler pipeline will have done many transforms - in some cases totally changing the control-flow layout - that make much of the profile meaningless when you try to inject it before those optimizations happened.

But BOLT injects the profiling data into the code exactly as it was at time of profiling, i.e. the binary itself.
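To make that concrete, here's a toy sketch (not BOLT's actual code - the function and block names are made up) of the kind of layout decision BOLT can make once the profile lines up with the final binary's control-flow graph: chain each block to its hottest not-yet-placed successor so the hot path falls through and the cold block is pushed out of the way.

```python
# Toy sketch: greedy hot-path block layout over the *final* CFG, using
# branch counts collected on the optimized binary. Not BOLT's real
# algorithm (it uses fancier orderings); names here are hypothetical.

def reorder_blocks(entry, succ_counts):
    """succ_counts: {block: [(successor, taken_count), ...]}"""
    placed, order = set(), []
    block = entry
    while block is not None:
        order.append(block)
        placed.add(block)
        # Prefer the hottest unplaced successor so it falls through.
        nxt, best = None, -1
        for s, count in succ_counts.get(block, []):
            if s not in placed and count > best:
                nxt, best = s, count
        if nxt is None:
            # Hot chain ended: start a new chain with any unplaced block.
            remaining = [b for b in succ_counts if b not in placed]
            nxt = remaining[0] if remaining else None
        block = nxt
    return order

cfg = {
    "A": [("B", 10), ("C", 990)],  # A almost always branches to C
    "B": [("D", 10)],
    "C": [("D", 990)],
    "D": [],
}
print(reorder_blocks("A", cfg))  # hot chain A, C, D first; cold B last
```

The point is that this only works if "A", "B", "C", "D" are the blocks that actually exist in the profiled binary - which is exactly what operating on the binary itself guarantees.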

It's totally insane, wacky, and super fucking cool - these folks should be hella proud of themselves.



In theory you can get a pretty good idea of where instructions came from in the source code, though the optimizer does, shall we say, obfuscate/spread that a little bit. (That's why debugging through optimized code, or looking at core dumps from optimized code, can be tricky - you can still mostly do it, there's just some loss of precision in the mapping back.)

Maybe the bigger problem is at what point the profiles feed back. Since a compiler may generate many object files which are then linked to form the final binary, you'd want to do this in the linker rather than earlier on.

I guess specifically with the kernel there's an extra layer of complexity. It looks like they use `perf` to record the profile, which is cool, and then they apply the results to the binary, which is also cool.


> In theory you can get a pretty good idea of where instructions came from in the source code, though the optimizer does, shall we say, obfuscate/spread that a little bit. (That's why debugging through optimized code, or looking at core dumps from optimized code, can be tricky - you can still mostly do it, there's just some loss of precision in the mapping back.)

I think the whole point of BOLT is that in practice, you can't get a good idea of where instructions came from.

And it's not even about instructions as much as control flow. LLVM, GCC, and other good compilers (like the ones I wrote for JSC) can and absolutely will fuck up the control flow graph for fun and profit. So if the point of the FDO is to create better code layout, then feeding the profiling samples in before the compiler does its fuckery will put you up shit creek without a paddle: the basic blocks and branches that the profiler is telling you about don't exist, and won't, until the compiler does its thing.
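A toy illustration of that mismatch (entirely hypothetical block names - this just models the bookkeeping, not any real compiler): the samples are keyed by blocks of the optimized CFG, and after the compiler has merged, split, or unrolled blocks, many of those keys simply have no counterpart in the pre-optimization CFG, so a naive source-level feedback pass has nowhere to put them.

```python
# Toy model: samples keyed by blocks of the *optimized* binary (what the
# profiler sees) vs. blocks that exist *before* optimization (what a
# source-level FDO pass must map onto). Block names are made up.

optimized_samples = {
    "entry": 1000,
    "loop.header.unrolled": 4000,  # created by unrolling; no pre-opt twin
    "loop.body.1": 2000,           # split out by the optimizer
    "loop.body.2": 2000,           # split out by the optimizer
    "exit": 1000,
}

pre_opt_blocks = {"entry", "loop.header", "loop.body", "exit"}

# Only samples whose block still exists pre-optimization can be used.
mapped = {b: n for b, n in optimized_samples.items() if b in pre_opt_blocks}
dropped = sum(n for b, n in optimized_samples.items()
              if b not in pre_opt_blocks)
print(mapped)   # only entry/exit survive the mapping
print(dropped)  # 8000 of 10000 samples have no pre-optimization home
```

Real toolchains try to be smarter than a dictionary lookup, but the fundamental problem is the same: the hottest blocks are often precisely the ones the optimizer invented.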

You could try to run the compiler forward until it recreates the control flow structure that the profiler is talking about, but that seems hella sketchy since at that point you're leaning on the compiler's determinism in a way that would make me (and probably others) uncomfortable. It would rule out running BOLT on binaries optimized with PGO and it would create super nasty requirements for build system maintainers.


I know very little about compilers or BOLT.

But what you just described sounds awesome! (And crazy.)



