Hacker News

This is not a complete answer, but it covers some languages. If you program too exclusively in dynamically-typed languages, you can get too used to not thinking about how physically large your types are, because you work in a world where everything is boxed, and allocations are so plentiful, with your language performing them at the drop of a hat, that you hardly have a way of thinking about them at all.

But there are many strongly-typed languages that look at types at compile time and decide how large they are, by which I mean, how many bytes of physical RAM they take. These languages have gotten better and better over time at the APIs they offer that give the benefits of this without incurring the programmer costs, but under the hood, no matter how slick they've gotten, there is still a mapping of type -> size & layout. In these languages, this sort of narrowing is a great deal more complicated, because it implies a new type, which will probably require more allocation. For instance, the example given between a "string" and a "number" definitely looks like two differently-sized types to me. This would become very complicated very quickly, as you start having to allocate these things in ways that can't be transparent to the user, you have to pass these types around, etc. etc.

One would expect that the affordances of statically-typed languages would end up being very different. And in fact, they are. Many modern statically-typed languages can solve the same problems outlined in the post, they just solve them in a different way. In this case, an Either, or more generally, a sum type. Then "getViewsOf" returns a value of that sum type and you can switch on it based on the value. It isn't exactly the same, in fact I'm not sure there's one aspect of it that's the same let alone the whole, but it solves the same problems.
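For concreteness, here's roughly what that looks like in TypeScript, with a tagged union standing in for the sum type. (The shape of "getViewsOf" and the stub body are my invention, not the post's actual code.)

```typescript
// Hypothetical sketch: a tagged union playing the role of an Either/sum type.
type ViewsResult =
  | { kind: "ok"; views: number }
  | { kind: "error"; message: string };

// Stub implementation, just so there's something to switch on.
function getViewsOf(url: string): ViewsResult {
  return url.startsWith("https://")
    ? { kind: "ok", views: 42 }
    : { kind: "error", message: "not a valid URL" };
}

const r = getViewsOf("https://example.com");
switch (r.kind) {
  case "ok":
    console.log(r.views);   // r narrowed to the "ok" arm here
    break;
  case "error":
    console.log(r.message); // r narrowed to the "error" arm here
    break;
}
```

The switch on the tag is the statically-checked analogue of flow typing's narrowing.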

So my suggestion would be that there are in fact a lot of languages that essentially have the same feature. They just don't call it "flow typing" and it's not spelled the same way.



Dynamic and strong typing are not opposed (dynamic and static typing are opposed, and, to the extent the distinction is valid, weak and strong typing are opposed). Dynamic doesn't mean everything is physically boxed (most dynamic language implementations don't box some subset of small primitive values, usually including bools and small ints). And, in any case, flow typing is a feature of static type systems (though some of those offering it are optional static typing systems for dynamic languages.)

But you kind of get at the real answer toward the end despite talking around it most of your post: Sum types plus pattern matching solves the same set of problems union types with flow typing solves, and a bunch that the latter doesn't (like composability.)


Remarkably, a post written in generalities speaks in generalities. None of those things may be "necessary", but there sure is a lot of all the things I said, aren't there?

What you call "talking around" I call an important thing to understand about a lot of languages: At some point, your data will be physically laid out in memory. If you don't care about performance... and I mean that as a perfectly valid choice in many situations, not a snark... it doesn't much matter how it is. But if you do, and you selected a language based on that, it matters a lot, and you have to view every type system detail through the lens of "what does it look like in memory?" if you want to understand it. The choices this class of language makes for its type systems will never fully make sense if you are not paying attention to this, or if you gloss over the legitimate difficulties that arise with any "why don't they just...?" sort of question.

(In particular, you really don't understand how good Rust is until you look at it through this lens, then look at just how much simultaneous power and convenience they've drawn out while never compromising on the memory issues. It's an incredible piece of work. It probably isn't "optimal" because such things don't exist in our real universe, but it probably is as close as we'll ever see for that class of language.)


> dynamic and static typing are opposed

Any given typing judgement we might want to make about a particular term can be checked statically or not, and it can be checked dynamically or not.

If the language has terms (... that might be evaluated) that don't get checked statically or dynamically, then it is going to be weakly typed (with respect to those typing judgements we're considering).

If we've already checked something statically, we "know" that it will always hold, and checking it dynamically is in some sense redundant (provided we actually believe our checking to be correct and our environment to be sufficiently robust that things don't change out from under us) and I think that's a part of the reason the two feel opposed, but in principle you could check a property both statically and dynamically for a particular term.

Step back a little to the language level and they aren't at all opposed; you might want to check some properties statically and some dynamically (eg. static checking of class/interface vs dynamic nullability in Java) or check a given property statically for some terms and dynamically for others (eg. gradual typing).


Something that's checked dynamically is not a "typing judgment", by definition. It's a proposition established at runtime, possibly even with non-trivial data attached describing "how" the dynamic check succeeds, and possibly affecting program operation in its own right as that proposition object gets "passed" to downstream functions that depend on that dynamic check. These are two altogether different things.


Sure. Tip: if something's true by definition, it's usually not interesting. Substitute the appropriate phrase - "observation of a property which, in the static context, might constitute a typing judgement"? Note that we're here specifically considering things that can be typed statically, as outside of that setting there is no question of whether static and dynamic are opposed.

The rest of your comment seems concerned with the fact that the tagging infrastructure typically needed to check these properties at runtime can also be used for other purposes. That's true, but I don't see the relevance.


That probably reads a little snarkier than appropriate. My apologies for the tone.


I think Swift combines sum types with flow typing in an interesting way, where if you use a guard statement it then propagates that information into future uses of the type. If I remember correctly, Kotlin also does something similar with its smart casts.
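The same guard-then-narrow pattern shows up in TypeScript too; a minimal sketch (the function name is made up):

```typescript
// Guard-style early return: past the guard, the type stays narrowed for the
// rest of the function, much like Swift's `guard` or Kotlin's smart casts.
function shout(x: string | null): string {
  if (x === null) return "(nothing)"; // guard: bail out on the null case
  return x.toUpperCase() + "!";       // x is just `string` from here on
}
```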

Pattern matching and sum types help, but they solve different problems.


Isn't this sort of an orthogonal problem? Flow typing implies that your type is in some way unknown at the time the code fragment is evaluated. I think this can mostly happen in two situations:

1) The type is generic: A function may be called with a different type on each call site - but for each particular call site, the type is known at compile time.

2) The type is polymorphic - i.e. the full type is not known at compile time at all.

When the first case occurs, a compiler could either go the C++ way and generate different, non-generic variants of the function - or go the Java way and just pretend the type is polymorphic.

For the "C++" option, I think flow typing would fit very well without introducing additional complexity: For each variant the compiler generates, it knows which concrete type is used, so it can evaluate the condition at compile-time and just drop the branches that don't apply for a particular variant of the function.

For polymorphic types, on the other hand, the additional complexity is already priced in: You need some kind of mechanism to represent values of different structure anyway (pointers, unions, whatever) - so flow typing wouldn't be much more than some syntactic sugar to save you from writing casts you'd have to do anyway.
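To illustrate that "syntactic sugar for casts" point with a small TypeScript sketch (function name is mine):

```typescript
// Without flow typing you'd write an explicit cast in each branch; with it,
// the typeof check itself performs the narrowing.
function describe(x: number | string): string {
  if (typeof x === "number") {
    return x.toFixed(2);  // x: number here, no cast needed
  }
  return x.toUpperCase(); // x: string here
}
```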


Another very common case in TypeScript (and other languages) is union types. Other languages have constructs for checking the underlying type of a union type directly, but typescript is smart enough to figure it out based off of the properties of the constituent types and what you've checked so far in your code.

e.g.

    interface A { type: 'a' }
    interface B { type: 'b' }
    type C = A | B;

    const c: C = ...;
    if (c.type === 'a') { /* c is of type A here */ }


That's basically case 1, right? The compiler knows which type is being used at each call site, so it can generate a separate function for each type in the union and eliminate the type check/dead branches.

I guess the exception would be if you have a non-homogenous array that you try to map over. In that case there's probably no way around boxing the values.


For consts and function arguments yes, I think. But you could have some mutable variable or field whose value depends on runtime state. In that case, you could have an actual polymorphic type.


But the previous example isn't that case.

    if (c.type === 'a') { /* c is of type A here */ }

This is dynamic typing: the code is checked at runtime, and the result is leveraged statically.


Yes, that's correct. In the GP example, c was const, so the type can be determined at compile time.


I was writing TypeScript, where const just means the variable is not reassigned. For instance this is valid:

    const c: C = Math.random() < 0.5 ? { type: 'a' } : { type: 'b' };


Ah, I'm sorry. You're right of course!


It actually happens all the time, which is why the suggestion is valid (namely that flow typing is useful). Flow typing is deeply related to union types, at least for expression-based languages.

Take rust. You have a function that on one path of a branch returns an int, in the other, a float. Rust says that's an error. I say your function returns a union type :)

And this is totally pervasive. A function which returns an int in an if branch (w/out an else) and elsewhere returns nothing actually returns an optional.
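That last case is exactly how TypeScript's inference behaves (a sketch with names of my choosing):

```typescript
// A branch that returns a number plus a fall-through that returns nothing:
// the inferred return type is the union `number | undefined` (an optional).
function maybeDouble(x: number) {
  if (x > 0) return x * 2;
  // no else: falling off the end returns undefined
}

const y = maybeDouble(3); // y: number | undefined
if (y !== undefined) {
  console.log(y + 1);     // y narrowed to number inside the check; prints 7
}
```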

A variable which is assigned in one branch to a reference to an int (an l-value in languages that screwed this up, like rust; a ref int in languages that, sensibly, use types to encode this property, like algol68) and to an int value in the other is technically a union type. If that union is added to a float - that shouldn't be an error! The int should be widened, the ref int should be dereferenced, then widened.

OTOH, if you try to assign an int to that union - now that is an error, because values can't be assigned to. You can only assign to types which represent a location in memory, like ref int, unlike int.

In the above discussion, the effect of control flow on the types is critical. Languages like rust ignore this, and to my mind, their ergonomics suffer considerably because of this. C++ side-steps this through statement based syntax, which makes variant types clunky.

Union types are beautiful and powerful, but there seems to be a lack of appreciation for the subtle and deep ways they impact language design. You have to design them in right from the beginning; it's a tricky thing to get right, and doing so absolutely requires flow typing.


Sum types are often used in cases where the type is unknown at run time. In C++ the analog would be std::variant. In Rust it would be enums. In Haskell it's algebraic data types.

A first-class sum type in C++ à la Rust or Haskell would certainly be appreciated, as the clunkiness of std::variant/std::visit is a well-known annoyance.


Could you expand on your first sentence? I don't understand what you mean by sum types being used when the type is unknown at runtime.


They meant unknown at compile-time, not run-time.


> In these languages, this sort of narrowing is a great deal more complicated, because it implies a new type, which will probably require more allocation.

Flow typing need not imply any kind of "narrowing", though. It could simply endow a control-flow branch with a statically-typed proof object, asserting whatever post-conditions were established in the branch. The "narrowing" could then become a separately-coded step, taking that proof object as part of its input arguments.
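One way to sketch that in TypeScript is a branded "proof" type: the runtime check mints a value whose type records the established post-condition, and downstream code demands that type instead of re-checking. All names here are hypothetical; this is a napkin sketch, not an implementation of any real compiler feature.

```typescript
// A number carrying a static "proof" that it is non-negative.
// The brand exists only at the type level; the representation is unchanged.
type NonNegative = number & { readonly __proof: "nonNegative" };

// The only place the runtime check happens; on success it mints the proof.
function checkNonNegative(n: number): NonNegative | null {
  return n >= 0 ? (n as NonNegative) : null;
}

// Downstream code takes the proof object as part of its input; no re-check.
function sqrtChecked(n: NonNegative): number {
  return Math.sqrt(n);
}

const p = checkNonNegative(9);
if (p !== null) {
  console.log(sqrtChecked(p)); // prints 3
}
```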


That sounds great at the napkin-sketch level.

But if you mean what I think you mean, now your entire compiler suite needs to be reworked to change its internal concept of what guarantees a "type" has. Previously the compiler was able to assume it was a contiguous chunk of memory, but now it needs the concept of a non-contiguous chunk of memory representing a certain type, and it needs everything everywhere updated to accommodate that, and it may need to add code to make copies for things that didn't used to need to make copies if you create an array of this new type, etc. It goes on and on.

I'm not the world's expert on all this under-the-hood stuff, but there is still something to be said for learning a bit of C or something deep enough to understand how it interacts with the machine, and C probably is still the best language for this despite its manifold other flaws. There's all kinds of fun games you can play in conceptual math space to make awesome programming languages, but if you want your language to play in the "goes fast" space you will find only some of those games map to anything efficient on real machines. (There is certainly a space for languages that just provide the nice abstractions, too; between "C performance" and "Python performance" there's a lot of room to play!)


Flow typing is purely a static analysis process. The representation of the type doesn't change, just the range of possible values in a scope constrained by some runtime condition.

Some things that might have triggered warnings no longer need to, because if the runtime conditional evaluated to true there's no way the warning could take effect.

Consider this little bit of C code:

  #include <stdio.h>

  void foo(int x, unsigned y) {
    if (x > y) {
      printf("but it's bigger!?");
    }
  }
Due to the idiosyncrasies of C, if you call foo(-1, 1) the comparison will promote x to an unsigned and the test will say x is bigger than y. There's a warning in most C compilers for it (-Wsign-compare), because it's the sort of invisible bug that'll wreck your code.

What's annoying, though, is if you do this:

  void foo(int x, unsigned y) {
    if (x >= 0 && x > y) {
      printf("but it's bigger!?");
    }
  }
And still get the warning. The compiler should be able to infer that the test x >= 0 means there's no fault path any more. Clang 13, at least, still warns.

A C compiler using flow typing would not issue a warning for the second implementation of foo. Neither x nor y change representations. The execution path through your compiler for producing byte code doesn't change. A few of your tree structures might pick up an extra annotation for the provable facts the warning generator can use to improve its output, that's all.


Pretty much. Flow typing like TypeScript does it is only really sensible when your run-time supports dynamic typing but your compiler checks types statically. In other words only a language similar to TypeScript could have a type system similar to TypeScript's.


Way before 1.0 (IIRC), Rust in fact used to have structural records, and wanted to try to only have them. But due to the challenges described by you and the article, they went back to regular nominal ADTs.

I do think that was the right call at that time.


In the end, I feel like structural typing is one of those attractive nuisances. It sounds good at the outset, and it does provide a limited amount of convenience, but then it turns out not to be that convenient, and it removes a lot of safety and flexibility by taking intent out of the thing.


I think the majority of the convenience of structural typing can be had with explicit casting between structurally equivalent types. It eases pressure on the type checker (checking that two types are structurally equivalent is pretty quick, checking that any expression satisfies something structurally is a little harder) and gets you ultimately what a programmer wants - convenience.


Go's interfaces are a decent 80/20 answer. You pay for them, you opt in, and they sit on top of the base type system, they aren't the primitive the entire type system is based on and they aren't what the compiler works with at the most basic level.

I have on a number of occasions wished I could define an interface that could be fulfilled by a field without me having to write a method, but it's not a use case that comes up enough to bend a type system design around. (And there are enough other workarounds, like embedding a custom type and such that it's fine.) I think a lot of people underestimate just how large every last detail of every design decision looms at this level of a language. Too many languages doing something slightly convenient for a relatively small set of flashy use cases without fully accounting for or realizing the costs it imposes.

(This is a pretty hot take but IMHO Python is almost imploding itself constantly adding features that match that description.)


> Go's interfaces are a decent 80/20 answer. You pay for them, you opt in, and they sit on top of the base type system, they aren't the primitive the entire type system is based on

That's interesting because I see them as the exact opposite. You're not saving or gaining anything of significance, and it's just increasing confusion.

Penny wise, pound foolish, if you will.


I'm not sure what you mean by "not gaining anything of significance." Go without interfaces (and for simplicity let's ignore 1.18's generics for a moment) would not be a useful language at any significant scale. You'd end up with a lot of "by hand" interfaces of structs full of function pointers (or method closures), only without the compiler support.

Nor am I sure exactly what "confusion" is being increased.


> I'm not sure what you mean by "not gaining anything of significance." Go without interfaces (and for simplicity let's ignore 1.18's generics for a moment) would not be a useful language at any significant scale.

Why would go not have interfaces? Where did you make that up from?

Go would have nominative interfaces, like most other languages.


There is a fundamental tradeoff between cost-free upcasts and efficient layout, but languages could offer both options and many in between.

IMO in general: in the face of tradeoffs, moving the goal posts to let the programmer decide is an unexplored 3rd option.


> There is a fundamental tradeoff between cost-free upcasts and efficient layout, but languages could offer both options and many in between.

Not sure what the relationship to my comment is there; this tradeoff exists in nominative land, it's got nothing to do with structural typing.


That depends on where it's used. For example, Rust's trait bounds are structurally typed, as when making a function that accepts a type that's Summary + Display. This "Summary + Display" type doesn't need to have a name, and in that case it's great. On the other hand, sometimes you want to have two strings, one that's a name and the other the address, and not substitute one for the other. I don't think there is "a" solution, only multiple solutions and tradeoffs.
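A rough TypeScript analogue of such a bound is an intersection type; the `Summary`/`Display` interfaces below are made up to mirror the Rust example:

```typescript
interface Summary { summarize(): string }
interface Display { display(): string }

// Accepts any value satisfying both interfaces; the combined
// "Summary & Display" type never needs a name of its own.
function show(x: Summary & Display): string {
  return x.display() + ": " + x.summarize();
}

const article = {
  summarize: () => "short version",
  display: () => "Article",
};
console.log(show(article)); // prints "Article: short version"
```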


> For example, Rust's trait bounds are structurally typed, as when making a function that accepts a type that's Summary + Display. This "Summary + Display" type doesn't need to have a name, and in that case it's great.

That's still nominative. The bound is not named, but it's based on names, not on structure. Otherwise you couldn't have marker types in such a bound, or would always have all of them and be unable to opt out.

> I don't think there is "a" solution, only multiple solutions and tradeoffs.

I have a hard time seeing that: if you have a structure, you can always give it a name. But you can also give different names to the same structure. So a nominative type system is more flexible and powerful at the very limited cost of having to name things. And even then, nominative type systems can support anonymous types (where the name is simply implicit / automatically generated) e.g. lambdas in C++ or Rust.


I think that's structural. My understanding is the bound is based on structure, as there is no way to differentiate between a Summary + Display in one function and a Summary + Display in another. On the other hand, with Rust's enums (which are nominal), you can have enum A { Toto, Tata } and enum B { Toto, Tata }, and both will be different. A function accepting A will not accept B. A function accepting T : Summary + Display will accept any type that implements both traits.

From my understanding, nominal vs structural typing is about how you consider group of types. For structural typing, types that contain the same thing are the same. For nominal, that isn't the case.
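TypeScript itself makes the contrast concrete: the two same-shaped types below are interchangeable, which is the structural behavior, whereas Rust's nominal enums would keep them apart. (Names here are hypothetical.)

```typescript
// Two types with identical structure...
interface A { toto: number }
interface B { toto: number }

function takeA(a: A): number {
  return a.toto;
}

// ...are the same type as far as the checker is concerned:
const b: B = { toto: 1 };
console.log(takeA(b)); // prints 1: structurally, a B is an A
```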


I entirely agree with you! And all of these are things covered in the post.


Yeah, I hadn't heard the term "flow typing" before but I thought the same thing, in other languages I've used you'd use a Result type and a switch to do what the author was doing.


It sounds like you are very confused about types and values. Static type checking does not incur any extra allocations at runtime.



