I take it this is a form of profiler guided optimization? Does MSVC already do t...

neerajsi · on Sept 11, 2020

MSVC does have this optimization. It has 3 sections: 'live', 'sick' (referenced, but uncommon), 'dead' (unreferenced in the profile).

jeffbee · on Sept 11, 2020

Yes, this is a pretty old technique. There were papers about hot/cold function splits based on profile data as early as 1996.

caf · on Sept 11, 2020

The technique itself isn't new in clang either. This is about a new implementation of it. The difference from the existing implementation is that the split happens later in the pipeline: it's deferred to the machine-specific code generation phase, whereas the existing implementation runs in the middle-end and is target-agnostic.

vlovich123 · on Sept 11, 2020

The other major piece is that the hot-cold split itself is more efficient. Rather than thunking out the cold code via a function call, it just jumps to the cold basic block, so there's no register spilling or function call overhead.

caf · on Sept 11, 2020

The function call overhead itself is irrelevant, because by definition these blocks are cold. The saving/restoring of callee-clobbered registers does affect the code size of the hot function though, so that's important.
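
To make the two shapes concrete, here's a rough source-level sketch in C. It only imitates what the compiler passes do, it isn't their implementation. The function names and the `error_count` counter are made up for illustration, and it assumes GCC or Clang for `__builtin_expect` and `__attribute__((cold, noinline))`.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical counter, only here so the cold path has a visible effect. */
static int error_count = 0;

/* Shape 1: the cold code is outlined into its own function, roughly what a
 * function-level split produces. The hot function reaches it with a call,
 * so live values must be passed as arguments and kept safe across the call. */
__attribute__((cold, noinline))
static long handle_negative_outlined(long value, size_t index)
{
    error_count++;
    fprintf(stderr, "negative value %ld at index %zu\n", value, index);
    return 0; /* contribute nothing to the sum */
}

long sum_outlined(const long *values, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        if (__builtin_expect(values[i] < 0, 0))
            total += handle_negative_outlined(values[i], i);
        else
            total += values[i];
    }
    return total;
}

/* Shape 2: the cold code stays a basic block of the same function. A
 * block-level splitter can move that block elsewhere in the binary and
 * reach it with a plain jump, with no call sequence in the hot loop. */
long sum_inline_cold_block(const long *values, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        if (__builtin_expect(values[i] < 0, 0)) {
            /* Cold block: expected to be rare according to the hint. */
            error_count++;
            fprintf(stderr, "negative value %ld at index %zu\n", values[i], i);
            continue;
        }
        total += values[i];
    }
    return total;
}
```

Both functions compute the same sum and skip negative entries. What differs is how the hot loop gets to the rare path, which is the part that affects the size of the hot code.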

labawi · on Sept 11, 2020

Blocks are cold by an imprecise measurement.

Decreasing the downside means you can apply the hoped-for optimisation more aggressively for more gain, so I would expect it to matter.

enigmo · on Sept 11, 2020

This was common practice at Microsoft in the mid-90s, maybe '95-'97ish. The set of apps involved was called BBT: "Basic Block Tools". Windows and SQL Server, among others, were post-processed with profiling data. It also deduped basic blocks, reducing binary bloat from inlining even without profiling data. It just needed some additional info in the debug symbols to work.

jeffbee · on Sept 11, 2020

Identical code folding still breaks debugging with modern llvm. It can make it seem like the call came from an impossible place in a stack sample. Is it something that Microsoft solved long ago?

drivebycomment · on Sept 11, 2020

Yes. And it was implemented relatively widely, e.g. https://docs.oracle.com/cd/E19205-01/820-4379/index.html, which IIRC was published before 2010.

DSingularity · on Sept 10, 2020

Yes, it is. Not sure what MSVC does.