> This transformation is affectionately known as sharing
No, it is ubiquitously known as CSE: common subexpression elimination.
The original DAG representation of the abstract syntax, on the other hand, exhibits substructure sharing.
> Of course, that invariant eventually changed. We added a way in the source langauge to introduce lets, which meant my algorithm was wrong.
Because you have to identify variables by more than just their symbol, due to shadowing, like De Brujn indexing and other schemes.
CSE is not particularly difficult in a functional language. Variable assignments throw a monkey wrench into it. Any terms with side effects also.
By the way, CSE can be done over intermediate representations, rather than abstract syntax. In an intermediate representation, you look for identical instructions with the same operands, not arbitrarily large expressions, while paying attention to variable liveness.
Another by the way is that not only do we have to worry about side effects, but we also cannot do CSE on expressions that guarantee returning a fresh object. E.g. if we are compiling Lisp we cannot merge two occurrences of (cons 1 2). The expressions must produce fresh cells which are not eq to each other. Ultimately that is linked to side effects; being able to mutate one cell with no effect on the other. Construction per se is not a side effect, even if it guarantees freshness.
No, it is ubiquitously known as CSE: common subexpression elimination.
The original DAG representation of the abstract syntax, on the other hand, exhibits substructure sharing.
> Of course, that invariant eventually changed. We added a way in the source langauge to introduce lets, which meant my algorithm was wrong.
Because you have to identify variables by more than just their symbol, due to shadowing, like De Brujn indexing and other schemes.
CSE is not particularly difficult in a functional language. Variable assignments throw a monkey wrench into it. Any terms with side effects also.
By the way, CSE can be done over intermediate representations, rather than abstract syntax. In an intermediate representation, you look for identical instructions with the same operands, not arbitrarily large expressions, while paying attention to variable liveness.
Another by the way is that not only do we have to worry about side effects, but we also cannot do CSE on expressions that guarantee returning a fresh object. E.g. if we are compiling Lisp we cannot merge two occurrences of (cons 1 2). The expressions must produce fresh cells which are not eq to each other. Ultimately that is linked to side effects; being able to mutate one cell with no effect on the other. Construction per se is not a side effect, even if it guarantees freshness.