>A compiler code generator that knows about two hypothetical divide units (or just a much more efficient single unit) could be much more effective at statically scheduling around them.
GCC and LLVM both have pretty good internal representations of the microarchitecture they are compiling for, but I'm still completely blind to how those models are actually used. If I ever work it out I'll write a blog post about it, but this is an area where GCC and LLVM are equally impenetrable.
I meant the actual scheduling algorithm - from what I can tell, GCC seems to basically use a state-machine-based in-order scheduler with the aim of not stalling the decoder. Currently, I'm mostly interested in basic block scheduling rather than trace scheduling or anything of that order.
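To make the basic block case concrete, here is a toy greedy list scheduler. It is only a sketch of the general idea, not what GCC or LLVM actually do: the `Instr` tuple, the `list_schedule` function, the unit names, and the 20-cycle divide latency are all made up for illustration, and every unit is assumed to be non-pipelined with no limit on issue width.

```python
from collections import namedtuple

# Hypothetical instruction record: name, functional unit, latency, dependencies.
Instr = namedtuple("Instr", "name unit latency deps")

def list_schedule(block, units):
    """Greedily schedule one basic block, cycle by cycle.

    block: instructions in program order.
    units: mapping of unit name -> number of copies of that unit.
    Assumes each unit is busy for the full latency (non-pipelined).
    Returns (start cycle per instruction, total schedule length).
    """
    start, finish = {}, {}
    busy_until = {u: [0] * n for u, n in units.items()}
    pending = list(block)
    cycle = 0
    while pending:
        for ins in list(pending):
            # An instruction is ready once all its inputs have been produced.
            if not all(d in finish and finish[d] <= cycle for d in ins.deps):
                continue
            slots = busy_until[ins.unit]
            free = next((i for i, t in enumerate(slots) if t <= cycle), None)
            if free is None:
                continue  # structural hazard: every copy of the unit is busy
            slots[free] = cycle + ins.latency
            start[ins.name] = cycle
            finish[ins.name] = cycle + ins.latency
            pending.remove(ins)
        cycle += 1
    return start, max(finish.values())

# Two independent divides feeding an add, plus one unrelated add.
block = [
    Instr("d1", "div", 20, ()),
    Instr("d2", "div", 20, ()),
    Instr("s", "alu", 1, ("d1", "d2")),
    Instr("t", "alu", 1, ()),
]

_, one_divider = list_schedule(block, {"div": 1, "alu": 1})
_, two_dividers = list_schedule(block, {"div": 2, "alu": 1})
print(one_divider, two_dividers)
```

With one divider the second divide has to wait for the first, so the block takes roughly twice as long as it does with two. A real scheduler would also use a priority function (critical path length, register pressure) instead of plain program order.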