I think the original author picked this example to broadly illustrate how easy it is to make ad hoc changes to your query without worrying much about implementation details. Polars, for example, converges on a similar API and gives you the same flexibility: you can iterate quickly, then refactor later into what you consider good practice.
Author here. At the time I worked in fraud detection and we needed to automate file generation for our BRMS. I initially created this to experiment with “models as dataframe expressions”, and Haskell is great for DSL-like stuff. That work is still ongoing: https://github.com/DataHaskell/symbolic-regression and dataframe has a native sparse oblique tree implementation.
As it’s grown, it’s been pretty cool to have transparent schema transformations: instead of every function mapping a dataframe to a dataframe, you can have function signatures like:
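The comment’s code example appears to have been lost, so here is a hedged sketch of the general idea rather than the library’s actual API. `DataFrame`, `scoreTransactions`, and the column names are all hypothetical; the point is only that with a type-level schema, what a transformation adds or requires shows up in its signature instead of being buried in the body:

```haskell
{-# LANGUAGE DataKinds, KindSignatures #-}

import Data.Kind (Type)
import GHC.TypeLits (Symbol)

-- Hypothetical: a frame indexed by a type-level list of
-- (column name, column type) pairs.
data DataFrame (schema :: [(Symbol, Type)])

-- The signature itself documents that this function requires
-- "amount" and "merchant" columns and adds a "risk" column.
scoreTransactions
  :: DataFrame '[ '("amount", Double), '("merchant", String) ]
  -> DataFrame '[ '("amount", Double), '("merchant", String), '("risk", Double) ]
scoreTransactions = undefined  -- implementation elided in this sketch
```

With signatures like this, a pipeline that drops or renames a column a later step depends on fails to compile rather than failing at runtime.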
Yeah, it's a bummer. It seems that notebooks supporting this sort of "reactive" workflow are custom-built around that model. Marimo, Pluto.jl, and Observable are mostly language-specific. Creating one would be non-trivial.
Do you have your approach documented (tutorial style) anywhere?
The rule of thumb is somewhere between a 5x and 10x difference, which is large if you're going to do anything heavy, but fine for most practical purposes. Roughly the difference between C and Python.