IMHO the best way to think of it (well, it's how I have long thought of it) is two lemmas:
1 - The compiler has more "fingers" than a human does. Back when I wrote programs in assembly I would use a printout to keep track of what I was doing and for debugging, and so would often put a finger on the paper to mark where a jump was and then go to the destination to see if it matched up, etc. This process isn't at all scalable, so the compiler is inevitably going to do better than I ever could on anything more than something trivial.
2 - I know more about the program constraints than I could ever express (much less be bothered expressing) in a HLL. "All the values will be less than 16" or "I can use this memory location for something else after I've read from it" or who knows what. So sometimes I can chop stuff out or even rewrite the compiler output locally to speed it up.
Also, I can only do this for a few architectures... and with modern x86 there are so many variants, with all sorts of micro-optimization affordances, that I can rarely beat the compiler (this is a corollary of lemma 1).
A compiler is a combinatorial optimizer (think bin-packing). In general, optimizers/solvers search for the best solution. Most production compilers don't have solvers in them, they use heuristics instead, but even the best solvers use tons of heuristics. Naturally a computer will search/try heuristics faster and more thoroughly than you, but sometimes you can do better, because performant searching is all about "knowing where to look".
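To make the bin-packing analogy concrete, here's a toy sketch in Python (purely illustrative, not from any real compiler) of the tradeoff: a fast first-fit-decreasing heuristic next to a brute-force search over orderings.

    from itertools import permutations

    def first_fit(items, capacity):
        # Place each item into the first bin with room, opening a new
        # bin only when nothing fits.
        bins = []
        for item in items:
            for b in bins:
                if sum(b) + item <= capacity:
                    b.append(item)
                    break
            else:
                bins.append([item])
        return bins

    def heuristic_pack(items, capacity):
        # First-fit-decreasing: one cheap pass over a sorted order.
        # At worst it uses about 22% more bins than optimal (plus a
        # constant), but it runs in n log n time.
        return first_fit(sorted(items, reverse=True), capacity)

    def search_pack(items, capacity):
        # Search: run first-fit under every ordering and keep the best
        # result. Covers vastly more of the space, but O(n!) - hopeless
        # beyond a dozen items, which is why compilers reach for
        # heuristics instead.
        return min((first_fit(p, capacity) for p in permutations(items)),
                   key=len)

The heuristic is "knowing where to look" distilled into a single sorting rule; the search just looks everywhere.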
Modern compilers are not doing much searching in general. It's mostly applying some feed-forward heuristic to decide whether to apply a transformation or not.
I think a slower, search-based compiler could have a lot of potential for the hottest parts of a program, where you're willing to spend exorbitant time on a search.
I have heard search-based compiler optimization called "superoptimization"[1]. It seems interesting, but as far as I know it has not seen much industrial use.
It scales well enough. You can apparently run Souper on SQLite in 24 hours on a beefy machine, according to a talk by one of the developers that I recently attended.
That's a cool data point, thanks! Though mind that it is a superoptimizer on LLVM bitcode, not on machine code itself, which avoids all the combinatorial explosion of register allocation and probably a bunch more, but I don't know enough about the program.
> Modern compilers are not doing much searching in general.
This is false. Any compiler that does register allocation and instruction scheduling (all of them) is searching for an optimal (or just good enough) solution to an optimization problem.
I'm not sure there is a clear separation between applying heuristics and searching a space. Often in compilers you search a subset of a space using heuristics, and you can adjust those to control how much of the space you cover.
For example, here is a pass that reorders WebAssembly globals in the Binaryen optimizer:
We have a simple criterion for the quality of a solution - the binary size under a given order - but the space of possible orders is huge (every permutation that keeps each global after its dependencies). What we do is a targeted search of that space, guided by heuristics with parameters that work well enough and aren't too slow in practice.
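This isn't the actual pass (that's C++ in the Binaryen repo), but a Python sketch of the general shape: a dependency-respecting greedy order driven by a heuristic weight. The names (`order_globals`, `deps`, `use_count`) are made up for the example; the premise is that frequently used globals want small indices, since indices are LEB128-encoded and small ones take fewer bytes.

    def order_globals(globals_, deps, use_count):
        # globals_: all global names; deps[g]: set of globals g must come
        # after (acyclic, as wasm global initializers can only reference
        # earlier globals); use_count[g]: heuristic weight.
        ordered, placed = [], set()
        ready = {g for g in globals_ if not deps[g]}
        while ready:
            # Greedy step: of the globals whose dependencies are all
            # placed, emit the one the heuristic likes best next.
            g = max(ready, key=lambda x: use_count[x])
            ready.remove(g)
            ordered.append(g)
            placed.add(g)
            ready |= {h for h in globals_
                      if h not in placed and deps[h] <= placed}
        return ordered

Widening how many candidates you consider at each step, or adding lookahead, covers more of the space at more cost - those are the adjustable parameters.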
> I'm not sure there is a clear separation between applying heuristics
There is, and it's quite simple: if your heuristic shrinks the search space faster than it takes to actually perform the search (i.e., try solutions), then you have a real algo on your hands. Otherwise you're just searching. This is basically the border between P and NP, and it's just that in compilers most of the problems are NP-hard, so none of the heuristics are really that good.
To me an algorithm is closed, while a heuristic (aka rule of thumb) is just a fast way to probably get a better solution / subset of the solution space at the cost of possibly missing the optimal result or even ending up in a pessimal corner case.
With an NP-complete problem you'd rather have some solution than use up your lifetime searching for the best.
i dunno what "closed" means here? converges? lots of things with heuristics converge...
most things people think of as algorithms are just heuristics codified. take for example unit propagation in DPLL (a fairly modern SAT approach); quoting wiki[1]:
> In practice, this often leads to deterministic cascades of units, thus avoiding a large part of the naive search space.
that *in practice* there means there's no guarantee, because ofc not - otherwise they would've proven something about NP. but they didn't, they just came up with a way to search that's sometimes/often not bad. a lot of people call this an algorithm ("... is a refinement of the earlier Davis–Putnam algorithm") because it fits the dictionary definition (a repeatable process), but i do not, because it's a repeatable process that isn't proven to produce anything (ie faster than brute force). and the intuition i'm proposing for why it doesn't is that it doesn't actually shrink the search space fast enough.
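to make that concrete, here's a minimal dpll sketch in python (illustrative only - real solvers add watched literals, clause learning, etc.): unit propagation is the "deterministic cascade", and everything it doesn't force falls through to plain branching, ie search.

    def unit_propagate(clauses, assignment):
        # repeatedly find clauses with all but one literal false and force
        # the survivor; each forced literal can create new unit clauses
        # (the "deterministic cascade" from the quote).
        changed = True
        while changed:
            changed = False
            for clause in clauses:
                if any(assignment.get(abs(l)) == (l > 0) for l in clause):
                    continue                      # clause already satisfied
                unassigned = [l for l in clause if abs(l) not in assignment]
                if not unassigned:
                    return None                   # conflict: clause falsified
                if len(unassigned) == 1:
                    lit = unassigned[0]
                    assignment[abs(lit)] = lit > 0
                    changed = True
        return assignment

    def dpll(clauses, assignment=None):
        # propagate first; whatever propagation can't decide gets
        # brute-forced by branching - this is the unreduced search space.
        assignment = unit_propagate(clauses, dict(assignment or {}))
        if assignment is None:
            return None
        free = {abs(l) for c in clauses for l in c} - set(assignment)
        if not free:
            return assignment
        v = min(free)
        for value in (True, False):
            result = dpll(clauses, {**assignment, v: value})
            if result is not None:
                return result
        return None

    # (x1 or x2) and (not x1) and (not x2 or x3): propagation alone
    # cascades to the answer with zero branching.
    print(dpll([[1, 2], [-1], [-2, 3]]))   # {1: False, 2: True, 3: True}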
note, my definitions/intuitions don't translate/have the same force elsewhere (especially in continuous/differentiable spaces) but they're a pretty standard/obvious perspective in combinatorial optimization (np-hard/np-complete problems).
I don't know what you're asking for - this isn't some kind of controversial topic - any iterative algo that isn't polynomial time (or is approximate) is search.
In the context of compilers there are many. Look at this block diagram for Chaitin's register allocator:
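For a feel of the search/heuristic mix in that diagram, here is a rough Python sketch of the simplify/select coloring loop. The name `color_graph` is made up, and this does Briggs-style optimistic spilling rather than Chaitin's original spill-immediately; real allocators also weigh spill costs, coalesce copies, and rebuild and retry after spilling.

    def color_graph(interference, k):
        # interference: {vreg: set of vregs live at the same time};
        # k: number of machine registers (colors).
        graph = {n: set(ns) for n, ns in interference.items()}
        stack = []
        # Simplify: a node with degree < k can always be colored, so
        # remove such nodes first; when none exists, pick a spill
        # candidate by a heuristic (here: highest degree).
        while graph:
            node = next((n for n in graph if len(graph[n]) < k), None)
            if node is None:
                node = max(graph, key=lambda n: len(graph[n]))
            stack.append(node)
            for m in graph.pop(node):
                graph[m].discard(node)
        # Select: rebuild in reverse, giving each node the lowest color
        # no colored neighbor has taken; a node with no free color is
        # spilled to memory.
        colors, spilled = {}, []
        for node in reversed(stack):
            used = {colors[m] for m in interference[node] if m in colors}
            free = [c for c in range(k) if c not in used]
            if free:
                colors[node] = free[0]
            else:
                spilled.append(node)
        return colors, spilled

    # two registers, three mutually interfering vregs: one gets spilled
    print(color_graph({"a": {"b", "c"}, "b": {"a", "c"},
                       "c": {"a", "b"}}, 2))   # ({'c': 0, 'b': 1}, ['a'])

Every choice of which node to simplify next and which to spill is a heuristic steering a search over colorings.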
I am going to be the token programming language researcher and say that what you really want is a dependently typed assembly language that your dependently typed higher-level language lowers to. One school of thought that has yet to bear fruit in "mainstream" programming, but gives tantalizing hints of what is possible, is expressing increasing amounts of your program's constraints in the type system, thereby informing the compiler precisely about your intent.
Doesn't make any sense. The "type constraints" on assembly operands (registers and numbers) are the ISA itself, and thus those constraints are combinatorial, not logical.
You misunderstand. If I, at the assembly level, could express things like "this register ranges from 0 to 4", I might be able to provide ever-lower levels of the stack with more information about my intent. Maybe the ISA can now do something clever to further optimize my program. Now, there will likely always be a point at which all types are erased, but if we can push that erasure lower and lower in the stack, then compilers have more and more information about what we actually want them to do, provided our type systems are rich enough to express that intent.
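As a toy illustration of the payoff (Python standing in for the type checker; `Ranged` and `lower_index_load` are invented for this example, not any real system): if the type carries the range, a lowering step can erase checks the type has already discharged.

    from dataclasses import dataclass

    @dataclass
    class Ranged:
        # A value whose type guarantees lo <= value <= hi.
        lo: int
        hi: int
        reg: str   # register holding it, e.g. "r1"

    def lower_index_load(idx: Ranged, base: str, length: int):
        # Emit (hypothetical) assembly for base[idx] with bounds checking.
        # If the type already proves 0 <= idx < length, the check
        # compiles away entirely - the intent survived the lowering.
        code = []
        if not (idx.lo >= 0 and idx.hi < length):
            code += [f"cmp {idx.reg}, {length}",
                     "jae .bounds_fault"]
        code += [f"mov rax, [{base} + 8*{idx.reg}]"]
        return code

    # "this register ranges from 0 to 4" - check erased:
    print(lower_index_load(Ranged(0, 4, "r1"), "rdi", 5))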
that's not a type system, that's an ILP (integer linear program). and we already have that in many compilers (bindings to ILP solvers).
> Maybe the ISA can now do something clever to further optimize my program.
this is a malformed sentence - the ISA is fixed and isn't making any gametime decisions about your code (modulo branch prediction/cache prefetching). it's all in the compiler, and like i said we already have this in many many compilers and it's surfaced in various ways (eg trip counts on loops).
There's a reason "token programming language researchers" are incapable of understanding modern computer architectures: a huge gap where an EE education would have been.
Correct - most of them think the goal is an infinite tower of sophisticated "abstractions", not realizing there's nothing abstract about a finite piece of silicon.
While I do think there are a lot of reasons why one should not write in assembly, time-critical base routines can often benefit a lot from being written in assembly (take GMP's mpn module, for instance). However, this puts some constraints on how the code must be structured (in GMP's case, it means that everything the mpz module does is derived from the mpn module), which cannot be said for all applications.