IMHO the best way to think of it (well, it's how I have long thought of it) is two lemmas:
1 - The compiler has more "fingers" than a human does. Back when I wrote programs in assembly I would use a printout to keep track of what I was doing and for debugging, and so would often put a finger on the paper to mark where a jump was and then go to the destination to see if it matched up, etc. This process isn't at all scalable, so the compiler is inevitably going to do better than I ever could on anything more than something trivial.
2 - I know more about the program constraints than I could ever express (much less be bothered expressing) in a HLL. "All the values will be less than 16" or "I can use this memory location for something else after I've read from it" or who knows what. So sometimes I can chop stuff out or even rewrite the compiler output locally to speed it up.
Also, I can only do this for a few architectures... and with modern x86 there are so many variants, with all sorts of micro-optimization affordances, that I can rarely beat the compiler (this is a corollary of lemma 1).
A compiler is a combinatorial optimizer (think bin-packing). In general, optimizers/solvers search for the best solution. Most production compilers don't have solvers in them, they use heuristics instead, but even the best solvers use tons of heuristics. Naturally a computer will search/try heuristics faster and more thoroughly than you, but sometimes you can do better, because performant searching is all about "knowing where to look".
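To make the bin-packing analogy concrete, here's a toy sketch in Python (purely illustrative, not from any real compiler) of the tradeoff: a fast first-fit-decreasing heuristic next to a brute-force search over orderings.

    from itertools import permutations

    def first_fit(items, capacity):
        # Place each item into the first bin with room, opening a new
        # bin only when nothing fits.
        bins = []
        for item in items:
            for b in bins:
                if sum(b) + item <= capacity:
                    b.append(item)
                    break
            else:
                bins.append([item])
        return bins

    def heuristic_pack(items, capacity):
        # First-fit-decreasing: one cheap pass over a sorted order.
        # At worst it uses about 22% more bins than optimal (plus a
        # constant), but it runs in n log n time.
        return first_fit(sorted(items, reverse=True), capacity)

    def search_pack(items, capacity):
        # Search: run first-fit under every ordering and keep the best
        # result. Covers vastly more of the space, but O(n!) - hopeless
        # beyond a dozen items, which is why compilers reach for
        # heuristics instead.
        return min((first_fit(p, capacity) for p in permutations(items)),
                   key=len)

The heuristic is "knowing where to look" distilled into a single sorting rule; the search just looks everywhere.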
Modern compilers are not doing much searching in general. It's mostly applying some feed-forward heuristic to decide whether to apply a transformation or not.
I think a slower, search-based compiler could have a lot of potential for the hottest parts of a program, where you're willing to spend exorbitant time on a search.
I have heard search-based compiler optimization called "superoptimization"[1]. It seems interesting, but as far as I know it has not seen much industrial use.
It scales well enough. You can apparently run Souper on SQLite in 24 hours on a beefy machine, according to a talk by one of the developers that I recently attended.
That's a cool data point, thanks! Though mind that it is a superoptimizer on LLVM bitcode, not on machine code itself, which avoids all the combinatorial explosion of register allocation and probably a bunch more, but I don't know enough about the program.
> Modern compilers are not doing much searching in general.
This is false. Any compiler that does register allocation and instruction scheduling (all of them) is searching for an optimal (or just good enough) solution to an optimization problem.
I'm not sure there is a clear separation between applying heuristics and searching a space. Often in compilers you search a subset of a space using heuristics, and you can adjust those to control how much of the space you cover.
For example, here is a pass that reorders WebAssembly globals in the Binaryen optimizer:
We have a simple criterion for the quality of a solution - the binary size under a given order - but the space of possible orders is huge (every permutation that keeps each global after its dependencies). What we do is a targeted search of that space, guided by heuristics with parameters that work well enough and aren't too slow in practice.
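This isn't the actual pass (that's C++ in the Binaryen repo), but a Python sketch of the general shape: a dependency-respecting greedy order driven by a heuristic weight. The names (`order_globals`, `deps`, `use_count`) are made up for the example; the premise is that frequently used globals want small indices, since indices are LEB128-encoded and small ones take fewer bytes.

    def order_globals(globals_, deps, use_count):
        # globals_: all global names; deps[g]: set of globals g must come
        # after (acyclic, as wasm global initializers can only reference
        # earlier globals); use_count[g]: heuristic weight.
        ordered, placed = [], set()
        ready = {g for g in globals_ if not deps[g]}
        while ready:
            # Greedy step: of the globals whose dependencies are all
            # placed, emit the one the heuristic likes best next.
            g = max(ready, key=lambda x: use_count[x])
            ready.remove(g)
            ordered.append(g)
            placed.add(g)
            ready |= {h for h in globals_
                      if h not in placed and deps[h] <= placed}
        return ordered

Widening how many candidates you consider at each step, or adding lookahead, covers more of the space at more cost - those are the adjustable parameters.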
> I'm not sure there is a clear separation between applying heuristics
There is, and it's quite simple: if your heuristic shrinks the search space faster than it takes to actually perform the search (i.e., try solutions), then you have a real algo on your hands. Otherwise you're just searching. This is basically the border between P and NP, and it's just that in compilers most of the problems are NP-hard, so none of the heuristics are really that good.
To me an algorithm is closed, while a heuristic (aka rule of thumb) is just a fast way to probably get a better solution / subset of the solution space at the cost of possibly missing the optimal result or even ending up in a pessimal corner case.
With an NP-complete problem you'd rather have some solution than use up your lifetime searching for the best.
i dunno what "closed" means here? converges? lots of things with heuristics converge...
most things people think of as algorithms are just heuristics codified. take for example unit propagation in DPLL (a fairly modern SAT approach); quoting wiki[1]:
> In practice, this often leads to deterministic cascades of units, thus avoiding a large part of the naive search space.
that *in practice* there means there's no guarantee, because ofc not - otherwise they would've proven something about NP. but they didn't, they just came up with a way to search that's sometimes/often not bad. a lot of people call this an algorithm ("... is a refinement of the earlier Davis–Putnam algorithm") because it fits the dictionary definition (a repeatable process), but i do not, because it's a repeatable process that isn't proven to produce anything (ie faster than brute force). and the intuition i'm proposing for why it doesn't is that it doesn't actually shrink the search space fast enough.
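to make that concrete, here's a minimal dpll sketch in python (illustrative only - real solvers add watched literals, clause learning, etc.): unit propagation is the "deterministic cascade", and everything it doesn't force falls through to plain branching, ie search.

    def unit_propagate(clauses, assignment):
        # repeatedly find clauses with all but one literal false and force
        # the survivor; each forced literal can create new unit clauses
        # (the "deterministic cascade" from the quote).
        changed = True
        while changed:
            changed = False
            for clause in clauses:
                if any(assignment.get(abs(l)) == (l > 0) for l in clause):
                    continue                      # clause already satisfied
                unassigned = [l for l in clause if abs(l) not in assignment]
                if not unassigned:
                    return None                   # conflict: clause falsified
                if len(unassigned) == 1:
                    lit = unassigned[0]
                    assignment[abs(lit)] = lit > 0
                    changed = True
        return assignment

    def dpll(clauses, assignment=None):
        # propagate first; whatever propagation can't decide gets
        # brute-forced by branching - this is the unreduced search space.
        assignment = unit_propagate(clauses, dict(assignment or {}))
        if assignment is None:
            return None
        free = {abs(l) for c in clauses for l in c} - set(assignment)
        if not free:
            return assignment
        v = min(free)
        for value in (True, False):
            result = dpll(clauses, {**assignment, v: value})
            if result is not None:
                return result
        return None

    # (x1 or x2) and (not x1) and (not x2 or x3): propagation alone
    # cascades to the answer with zero branching.
    print(dpll([[1, 2], [-1], [-2, 3]]))   # {1: False, 2: True, 3: True}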
note, my definitions/intuitions don't translate/have the same force elsewhere (especially in continuous/differentiable spaces) but they're a pretty standard/obvious perspective in combinatorial optimization (np-hard/np-complete problems).
I don't know what you're asking for - this isn't some kind of controversial topic - any iterative algo that isn't polynomial time (or is approximate) is search.
In the context of compilers there are many. Look at this block diagram for Chaitin's register allocator:
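For a feel of the search/heuristic mix in that diagram, here is a rough Python sketch of the simplify/select coloring loop. The name `color_graph` is made up, and this does Briggs-style optimistic spilling rather than Chaitin's original spill-immediately; real allocators also weigh spill costs, coalesce copies, and rebuild and retry after spilling.

    def color_graph(interference, k):
        # interference: {vreg: set of vregs live at the same time};
        # k: number of machine registers (colors).
        graph = {n: set(ns) for n, ns in interference.items()}
        stack = []
        # Simplify: a node with degree < k can always be colored, so
        # remove such nodes first; when none exists, pick a spill
        # candidate by a heuristic (here: highest degree).
        while graph:
            node = next((n for n in graph if len(graph[n]) < k), None)
            if node is None:
                node = max(graph, key=lambda n: len(graph[n]))
            stack.append(node)
            for m in graph.pop(node):
                graph[m].discard(node)
        # Select: rebuild in reverse, giving each node the lowest color
        # no colored neighbor has taken; a node with no free color is
        # spilled to memory.
        colors, spilled = {}, []
        for node in reversed(stack):
            used = {colors[m] for m in interference[node] if m in colors}
            free = [c for c in range(k) if c not in used]
            if free:
                colors[node] = free[0]
            else:
                spilled.append(node)
        return colors, spilled

    # two registers, three mutually interfering vregs: one gets spilled
    print(color_graph({"a": {"b", "c"}, "b": {"a", "c"},
                       "c": {"a", "b"}}, 2))   # ({'c': 0, 'b': 1}, ['a'])

Every choice of which node to simplify next and which to spill is a heuristic steering a search over colorings.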
I am going to be the token programming language researcher and say that what you really want is a dependently typed assembly language that your dependently typed higher-level language lowers to. One school of thought that has yet to bear fruit in "mainstream" programming, but gives tantalizing hints of what is possible, is expressing increasing amounts of your program's constraints in the type system, thereby informing the compiler precisely about your intent.
Doesn't make any sense. The "type constraints" on assembly operands (registers and numbers) are the ISA itself, and thus those constraints are combinatorial, not logical.
You misunderstand. If I, at the assembly level, could express things like "this register ranges from 0 to 4", I might be able to provide ever-lower levels of the stack with more information about my intent. Maybe the ISA can now do something clever to further optimize my program. Now, there will likely always be a point at which all types are erased, but if we can push that erasure lower and lower in the stack, then compilers have more and more information about what we actually want them to do, provided our type systems are rich enough to express that intent.
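As a toy illustration of the payoff (Python standing in for the type checker; `Ranged` and `lower_index_load` are invented for this example, not any real system): if the type carries the range, a lowering step can erase checks the type has already discharged.

    from dataclasses import dataclass

    @dataclass
    class Ranged:
        # A value whose type guarantees lo <= value <= hi.
        lo: int
        hi: int
        reg: str   # register holding it, e.g. "r1"

    def lower_index_load(idx: Ranged, base: str, length: int):
        # Emit (hypothetical) assembly for base[idx] with bounds checking.
        # If the type already proves 0 <= idx < length, the check
        # compiles away entirely - the intent survived the lowering.
        code = []
        if not (idx.lo >= 0 and idx.hi < length):
            code += [f"cmp {idx.reg}, {length}",
                     "jae .bounds_fault"]
        code += [f"mov rax, [{base} + 8*{idx.reg}]"]
        return code

    # "this register ranges from 0 to 4" - check erased:
    print(lower_index_load(Ranged(0, 4, "r1"), "rdi", 5))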
that's not a type system, that's an ILP (integer linear program). and we already have that in many compilers (bindings to ILP solvers).
> Maybe the ISA can now do something clever to further optimize my program.
this is a malformed sentence - the ISA is fixed and isn't making any gametime decisions about your code (modulo branch prediction/cache prefetching). it's all in the compiler, and like i said we already have this in many many compilers and it's surfaced in various ways (eg trip counts on loops).
There's a reason "token programming language researchers" are incapable of understanding modern computer architectures: a huge gap where an EE education would have been.
Correct - most of them think the goal is an infinite tower of sophisticated "abstractions", not realizing there's nothing abstract about a finite piece of silicon.
While I do think there are a lot of reasons why one should not write in assembly, time-critical base routines can often benefit a lot from being written in assembly (take GMP's mpn module, for instance). However, this puts some constraints on how the code must be structured (in GMP's case, it means that everything the mpz module does is derived from the mpn module), which cannot be said for all applications.