Async Overloading (yoshuawuyts.com)
59 points by lukastyrychtr on Aug 30, 2021 | hide | past | favorite | 48 comments


What I wish articles like this would dig into, and why I think the problem is harder than first blush, is how to deal with the actual future values themselves. The audience here seems to be Rustaceans a bit more than "all programmers"; there's a quick intro to "here's how Swift does X", which is nice, but how does Swift deal with future values?

That is, it isn't an error to not immediately .await an async function in Rust. It is problematic to discard the future (to never await). But the real power of futures comes from the composability:

  timeout(some_async_op(), 30).await?
The "synchronous context" example is thus particularly confusing, IMO; a Rustacean will look at that and ask "but what makes that context synchronous?" or "how does it know that I want the synchronous version of the function when nothing in the code indicates it?"

  timeout(some_async_op(), 30).await?
          └─────────────┘
             is this a synchronous context?
             (we don't *want* it to be.)
Apparently, Swift introduces a separate keyword:

  async let f = { some_async_func() }
But it seems like all uses of "f" must be "await f", so it doesn't seem like it's really a future. (One could not, AFAICT, write "timeout".) There is some other stuff with tasks & task groups, so perhaps it's possible with those, but that's my reading budget for Swift for now…

Also,

> With the overload added, we can start suggesting fixes for errors like this too 2. For example:

    |
  help: try removing `.await` from the function call
    |
  3 | async fn f() {
  4 -     do_something().await;
  4 +     do_something();
    |
I think f() in this example is supposed to be non-async.


It's mentioned in the "Overloading existing stdlib functions" section.

>One issue to be aware of is that unlike Swift we cannot immediately fail if a synchronous overload is selected in an async function. Rust's async models allows for delayed `.await`ing, which means we cannot error at the call-site. Instead we'll likely need to hook into the machinery that enables `#[must_use]`; allowing us to validate whether the returned future is actually awaited — and warn if it's not. Even though this is slightly different from Swift appears to do things, it should not present an insurmountable hurdle.

It's a bit confusing that it mentions selecting "a synchronous overload" but then goes on to talk about it returning a future. I assume it means selecting "(what looks like) a synchronous overload".


I like this proposal, a lot. I believe the prevailing languages of the next decade will solve the colored function problem not by removing colors but by integrating the concept into the syntax so as to achieve the appearance of having removed them. In other words: make calling either one a simple matter of context, which the compiler should carry 99% of the time and the user should only worry about when there's a reason to do so (when the compiler cannot infer, or when the alternate context is needed explicitly). If the user wants an async runtime, so be it. If the user has no need for one, that should work too. The magic will be in making this seamless so that neither camp feels left out.


At this point, I'm pretty convinced that "async" in 10 years is going to be like "object oriented" is now. A gigantic dead end that we're barrelling down for the wrong reasons--in this instance because everything web must be Javascript and work around all its warts.

Every time someone talks about async, they always forget that you need an Executor, and those executors always have problems because they don't have the boundedness of Javascript.

Using Javascript means your executors can only fire on a very constrained set of events that are completely prescribed. The moment you use something like Rust, that assumption is out the window and now your executor can't just do a couple of sockets and some DOM things. It has to accommodate sockets, timeouts, message queues that might wake up on a message from another thread, character file descriptor devices, a terminal screen refresh, a video card vertical blanking event, etc.

In short, your Executor has to accommodate an infinite variety of events when using async outside of the bounds of Javascript. Effectively everybody has to write their own personal Executor the moment they do something other than an IP socket.

That's not an "improvement" in programming. That's a step backwards.


I'm not certain you're wrong, but not for those reasons. There are still threads for all the long tail of synchronous operations. It's easy enough to do from Tokio or other async frameworks. However, the Go approach of making async look like sync code is certainly much easier to write and read code for. I'm skeptical that async is really superior to that.


Futures and channels I think are abstractions that solve different problems. Channels for instance are really good for producer/consumer problems. Futures on the other hand can be extremely elegant and flexible in ways that channels aren’t great at.


That is true, but not the point I'm making. I'm talking about goroutines versus async/await. Both have an async runtime under the hood, but goroutines feel just like normal programming.


That's why I like the idea of scoped executors rather than trying to have a global executor do everything. Libs should provide their own executors and manage their own async event handling while allowing the user to provide their own implementation if needed (e.g. for single-threaded test code where performance is not a problem, or for a production environment where you can heavily optimize the executor if your requirements demand it). If you don't call any of the async overloaded functions in a lib, then you don't get any of the async code compiled in, including the runtime.


I don’t think any of this is about bringing the JS execution model to other languages. Rather, it’s about making it easier to build the next nginx.


A major consumer of async needs those seams. You often need the coloring because in many situations you want explicit control of what will yield execution. You might want to hold on to a UI thread or a specific OS thread for interop. A choice of all sync or all async is not good enough either.

There might be some better syntax with scopes, but Kotlin hasn't been able to do away with the get-off-the-UI-thread problem.

I won't say we'll never solve it but I don't think the solution is here yet.


This problem was already solved a long time ago: asynchronous execution is a monad (as are error handling and list comprehension and any number of other fun features), and do syntax is the syntax sugar people are looking for, letting you decide to dip below the abstraction on a case-by-case basis... only no one seems to want to build support for monads into their language, as they think they are too complicated or smell like Haskell (is that even a bad thing?!) or something. (And then you get awkward stuff like Rust deciding to take the data structure from a popular monad--Either, as Result--but providing none of the scaffolding or behaviors or synergistic language features to make it pleasant to use :/)


I'm not suggesting going all in on one or the other, and neither does the article. I think you'd be quite happy with the full proposal. Essentially, you control whether you get the async or sync version by calling from an async or sync closure, respectively.


Yes, I think we mostly agree. I was trying to add some context to the problem.

I apologize that my slightly more pessimistic take on where we currently are was construed as disagreement.


I think you need a way to be explicit about which overload you want to call.

So in addition to the await and the implicit sync:

    do_something().await;   
    do_something();    
You also need a way to directly call the sync and async version:

    do_something().sync;   // I really want the sync version
    do_something().async;  // I want the Promise result


Agreed. Though, I think that in most contexts the compiler would probably be able to disambiguate which version is intended from type resolution. So maybe it would be better to be able to optionally specify whether we want the sync or async version, possibly overloading our usual polymorphic syntax, e.g.

  do_something().await;  // It's clear that we're using the async version
  do_something::<sync>();  // I really want the sync version, even in async contexts


I wish Rust would just make everything (that can be async) in the stdlib async, but also provide a simple primitive for "block the current thread on this async call", and guarantee that all of the stdlib functions that are async work when called with that primitive. Something like "block!(async_fn())" or "async_fn().block". Or even just allow non-async functions to .await, with the meaning of "block the world until this finishes running"

What am I missing that makes this undesirable?


First and foremost that Rust is a systems programming language that shouldn't have any kind of implicit runtime.

If you run any implicit event loop, that property would no longer hold.

Next, it wouldn't even be desirable for lots of applications. Async IO and functions are not necessarily faster than synchronous operations - it might well be the opposite if you don't have a lot of concurrency. E.g. if you read from one blocking socket, you do a single `read()` syscall. Add async IO, and you need an additional `select/epoll_wait` call.

Then there is an actual cost to composing Futures, which are large values on the stack, sometimes having to box them (since otherwise recursion won't work, and neither will dynamic dispatch, etc.). The latter might certainly be avoidable with a different kind of "async design" than what Rust currently has, but there will always be some tradeoffs.


C and C++ happen to have a runtime too, or whatever you feel like calling the code that calls main(), runs functions registered on executable load, runs atexit() handlers, does soft floating point emulation, runs constructors/destructors on start/exit, handles exceptions, and provides threading support.


But it's not implicit and/or required. In embedded you can use C/C++ without a runtime.

> calling the code that calls main(), runs functions registered on executable load, registered atexit(), soft floating point emulation, running constructors/destructors on start/exit, handling exceptions, threading support.

You call main() (or whatever) from your reset handler, there is no "executable load", there is no "atexit()", floating point can use hardware instructions (if available), you call __libc_init_array() to call constructors if required, you usually don't use exception handling, and of course there is no threading support.

Perhaps you can call "runtime" the instructions inserted to call constructors/destructors but I think it doesn't qualify.


A runtime is everything that the compiler generates in addition to the code that is written so that it runs at all.

The embedded use of C and C++ dialects relies on compiler-specific extensions and is, per the ISO standard, implementation-defined, without any guarantee of code portability under the freestanding definition.


That would mean a runtime in the standard library, which was explicitly rejected. Different runtimes have different pros and cons and different use cases, choosing one and putting it in the standard library means that all others are second class and probably not well supported. For example, some might want an io-uring based async runtime, while others want a runtime-per-core system for a web server, and another might want a simple lightweight runtime for an embedded environment. Someone else wants to use raw OS primitives without ever spawning a runtime (to use block_on and an async function that does IO, you still need a runtime running in the background). A better solution is to provide standard interfaces for runtimes and libraries to develop against, something that is being worked on.


Because then you mess up all the other async tasks that would be spawned by the callee and possibly deadlock your program (without an async runtime in the stdlib).

If you're using `async_std`, right now you simply, explicitly, introduce the runtime:

    let fut = async_fn();
    let res = async_std::task::block_on(fut)?;
(pulling out the Future into fut is just illustrative, it can of course be a single line) instead of:

    let res = async_fn().await?;
So it's already pretty darn easy to do what you want.

Also, here's an easy way to make it feel like a keyword:

    use std::result::Result;
    use std::error::Error;

    use async_std::task::block_on as block;

    fn main() -> Result<(), Box<dyn Error>> {
        let fut = async_std::fs::read_to_string("./Cargo.toml");
        let file = block(fut)?;
        println!("{}", file);
        Ok(())
    }
    
    ---
    
    [package]
    name = "async-sandbox"
    version = "0.1.0"
    edition = "2018"

    [dependencies]
    async-std = "1.10"
One final thing to point out here is that, unlike some other languages, in Rust you can call any async function just like any normal function. The only thing that requires a runtime is resolving the returned future. So it's not so much that functions are colored but that resolving futures needs a runtime and there is none in the current standard lib.
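To make that last point concrete, here's a std-only sketch (the `compute` function is just an invented example): calling an async fn merely constructs a future value; nothing runs and no runtime is involved until something polls it.

```rust
use std::future::Future;

async fn compute() -> u32 {
    40 + 2
}

fn main() {
    // Calling the async fn needs no runtime at all: it just constructs
    // a future value, which is inert until something polls it.
    let fut = compute();

    // We can name it, move it around, or hand it to combinators...
    fn takes_any_future<F: Future>(_f: F) {}
    takes_any_future(fut);

    // ...but resolving it (.await / block_on) is the only step that
    // needs an executor, and none ships in the standard library.
    println!("built a future without a runtime");
}
```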


I think the OP meant that those who did not want async would use block_on, while the async API would just work for those who wanted to use it.


I tried to edit for clarity. The issue is the "bringing in the runtime", or the `async_std` part as you point out elsewhere. I guess I don't see a huge benefit in making a keyword like `block` that aliases `async_std::task::block_on`. Indeed you could just `use async_std::task::block_on as block;` and be on your way.


There are a couple of different objections to this, though there is also a lot of support. The current state of it is "someone needs to write up an RFC."

If you want the full gory details: https://github.com/rust-lang/rust/pull/65875


Well that's just the "simple primitive for 'block the current thread on this async call'", which can, with the new `Wake` trait, be written in around 10 lines of code. The real change would be adding an actual runtime to the standard library.
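For reference, here's roughly what that small `block_on` looks like using only the `Wake` trait from `std` (a minimal sketch along the lines of the example in the std docs, not a production executor):

```rust
use std::future::Future;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};
use std::thread::{self, Thread};

// A waker that unparks the thread blocked inside `block_on`.
struct ThreadWaker(Thread);

impl Wake for ThreadWaker {
    fn wake(self: Arc<Self>) {
        self.0.unpark();
    }
}

// Minimal `block_on`: poll the future, parking the thread until woken.
fn block_on<F: Future>(fut: F) -> F::Output {
    let mut fut = Box::pin(fut);
    let waker = Waker::from(Arc::new(ThreadWaker(thread::current())));
    let mut cx = Context::from_waker(&waker);
    loop {
        match fut.as_mut().poll(&mut cx) {
            Poll::Ready(out) => return out,
            Poll::Pending => thread::park(),
        }
    }
}

fn main() {
    let answer = block_on(async { 40 + 2 });
    println!("{}", answer); // 42
}
```

Note this only drives the one future on the current thread; IO futures that depend on a reactor still need their runtime running somewhere, which is exactly the "actual runtime" question.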


Potato, potato. Both involve adding some form of runtime, now we’re just negotiating details.

(I personally would like a very very simple one like the ten LOC version, but wouldn’t want anything more.)


Well the block_on runtime is only meant for testing or bridging sync and async code, so it wouldn't solve the problem :)


It solves my problem, which is why I like it as a solution (documentation, examples, and tests), haha!


This is the approach Microsoft took with WinRT, and Google with Android, to force devs to go fully async.

It did not work that well, as not everyone is comfortable in an async-only world.


Wouldn't you be using 2 threads for at least some of the duration of every instance of that? Seems like it would scale very poorly.


I'm not much a fan of adding all these custom keywords to function signatures. I hate to use the m-word, but it really seems we are dancing around how to express monadic concepts in the language.

I should note that I don't really want monads in Rust; I like them in Haskell, but I've never felt Rust really needed full HKT & monads. It just feels wrong to me that we might add more syntax to the language & fn signature for each of these discrete concepts. So we have async; now we add the 'default' fn "color" and the "try" fn color. More colors feels like the wrong way to solve this problem to me. I don't know the answer here, it's just a sense I get... anyone else?

I was really taken by Carl's post on async and how it discussed how .await could be removed https://carllerche.com/2021/06/17/six-ways-to-make-async-rus.... If we're concerned about having "colored" functions, maybe this is the avenue we should be exploring rather than lifting all the different colors out of the type signature and into special keywords?


The reason you get that sense is that we as programmers are drawn to the idea that generic solutions are the correct ones, and that special cases are bad. A HUGE part of our discipline is about creating and then wielding abstractions.

However, as Alan Perlis famously said, “a programming language is low level when its programs require attention to the irrelevant.“ In general, Rust’s default stance is that these details shouldn’t be abstracted away, because they actually matter.

This is the fundamental tension that exists here.


This obviously doesn't change anything, but shouldn't the example imports in the first code block use `std::fs`, not `std::io`? The existing function in the standard library is `std::fs::remove_file`.


> When designing an API in Rust which performs IO, you have to make a decision whether you want it to be synchronous, asynchronous, or both.

Why must that be true? Why can't you write the interface once, and have concurrency be an implementation detail?


Because the design of async in Rust precludes that. Some of the details are leaky. For example, recursive functions require dynamic allocation in an async context because the async state must be statically sized. See https://rust-lang.github.io/async-book/07_workarounds/04_rec...
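A minimal illustration of that workaround (hand-rolled, with an invented `factorial`; the no-op waker works here only because this particular future never actually suspends):

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// A plain recursive `async fn` is rejected: its state machine would have
// to contain itself, giving an infinitely sized type. Boxing the
// recursive call moves the inner state to the heap and breaks the cycle.
fn factorial(n: u64) -> Pin<Box<dyn Future<Output = u64>>> {
    Box::pin(async move {
        if n == 0 { 1 } else { n * factorial(n - 1).await }
    })
}

// A do-nothing waker suffices: this future completes on its first poll.
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn main() {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = factorial(5);
    match fut.as_mut().poll(&mut cx) {
        Poll::Ready(v) => println!("{}", v), // 120
        Poll::Pending => unreachable!("no await point ever returns Pending"),
    }
}
```

The `Box::pin` on every recursive call is the kind of dynamic allocation the async book describes; the equivalent synchronous function needs none of it.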

There's no easy way to get around the function color problem unless you went the way Go did. But Go's choice made C ABI interoperability more complex. Rust chose simpler C ABI interop, at least for the sync case--no matter which choice you make neither approach makes async interop seamless.

The fun part will be seeing how Rust integrates async and fallible allocation. Both of these issues you could see coming from 10 years away, and also see how they'd interact, but Rust devs decided to punt on some of these hard decisions early on.

This sort of wheel reinvention is what you typically see in every new language, unfortunately, and you typically see it resolved in much the same way because solutions are path-dependent on very early design decisions, and almost everybody makes the same early decisions. Except for Go. Go made the decisions it did because the designers had decades of language design experience, including decades of async experience under their belt. Rust designers came with a different set of experiences and goals, and this shows. (Not saying Go is better than Rust--in fact, non-fallible allocation was always a show-stopper for me in some critical niches. But Go made the most difficult decisions up front, and that included putting async first.)


Rust did not punt on concurrency early on. The seventh word used in the first sentence ever describing Rust is "concurrent" http://venge.net/graydon/talks/intro-talk-2.pdf It even comes before "safe"!

What did happen was that the ways in which concurrency was implemented changed as other design constraints on the language changed. But 1.0 wasn't released until we knew what the concurrency story for Rust would be, even if sorting out all of the details took a few years.

"leaky" is in the eye of the beholder. Yes, if you think this should be abstracted, then it's a leak. But not everyone thinks that it should; many things about concurrent vs sequential are different, and Rust likes to expose certain kinds of costs and promises in the type signatures of functions. For its core audience, this is not a leak, this is giving you important information about the context the function should be used in.


> Rust did not punt on concurrency early on.

Rust started with green threading/fibers then quickly rejected that approach. Then it spent years iteratively building an alternative solution, which is still underway.

That's punting in my book; and it's punting for the majority of Rust aficionados who are surprised by the various twists and turns things take as the solution (as inevitable as it is) slowly materializes. By contrast, nothing of substance about Go async has ever changed, except perhaps the change in the default value of GOMAXPROCS. It was complete at conception.

It wasn't a wrong decision that Rust made; it was just a choice. But 10 years out it's not entirely implausible that if Rust had stuck with fibers that it may have driven the required OS improvements (e.g. Google's User Managed Concurrency Groups (UMCG) Linux kernel patches) that would have resolved some of the issues. It's not like Rust has become ubiquitous in the embedded space either, considering that it's held back by LLVM in that regard.

Something similar happened with fallible allocations. Very early on, most Rust devs declared that they believed that attempting recovery from allocation failure was folly (which in the land of GUIs from whence most of them came was the near-universal opinion), and so shot down attempts to consider fallibility in the APIs.[1] Cue 10 years of slowly walking that back, with iterative (and still mostly pending) changes that were less than ideal, owing to the fact that handling allocation failures is made infinitely more difficult if you don't take it into consideration from day 1.

[1] And, no, it's not enough to say that Rust core is allocation agnostic, because setting aside that only a tiny minority of Rust programmers only stick to core, the decision involved setting idioms and practices surrounding panics.


Okay, we understand the word “punting” extremely differently then. Active work is the opposite of punting.

And yeah, maybe in an alternative universe where everything is different, things would be different. But that zero-cost C interop is one of the only reasons Rust succeeded enough to be making it to the point where its conclusion is considered, let alone re-writing all of the primitives of all the current OSes and waiting until those are widely deployed enough to be able to use only them and ignore all those embedded users. I don't possibly see a universe where that works out. But in theory it could happen, I guess.


> But 10 years out it's not entirely implausible that if Rust had stuck with fibers that it may have driven the required OS improvements ...

Rust would've been a language nobody had heard of. It wouldn't have driven anyone to do anything because it wouldn't have had widespread adoption in the first place. As-is I'm using it in embedded programming and loving it. I certainly would never have picked Go for that.

I think you're not really understanding Rust's approach here. Green threads would be much too heavyweight to build into the language itself, and would mostly preclude it from being seriously used in interesting domains like embedded programming.


Since I can't edit the above comment, one additional thought is that your criticism is valid for a language like Python, which has a much higher tolerance for abstractions such as green threads being built into the language itself. It would've been interesting if Python had gone with a solution other than async/await.

I'm also a little confused when you say "Rust started with green threading/fibers". I think the term "fiber" is overloaded here, but Rust did start with green threads (M:N preemptive multitasking). Rust now has support for cooperative multitasking via async/await. From what I can find of UMCG, it kind of misuses the term fiber. Looks like it's just green threads that the kernel is aware of?


It is an interesting thought experiment to imagine how things might have played out if Rust had kept green threads.

I don't know if it would have changed the Rust vs Go story much. There'd still be the learning curve of the borrow checker, and many people using Go aren't necessarily doing a ton of concurrency.

The Rust vs C/C++ story on the other hand... If Rust had a bunch of extra runtime stuff and limited interop, I suspect it would not have been perceived as a serious replacement, which may have hindered adoption. At least when I selected a language for my highly concurrent network server, I only considered C, C++, and Rust.

I'll admit it's a fine line. I'm using async Rust, with an executor/reactor and all that jazz, and the end result may not be much different than if Rust had made those decisions for me. Having the power to make my own decisions is very appealing though. It's possible Go could have been a contender for my project if it wasn't such a limited language. And Mozilla's backing of Rust helped it vs other fringe options.


Just as an example of how many years this might take: .NET was one of the first ecosystems to bring async/await into the mainstream, with C# 5 (2012), and the upcoming .NET 6 and related languages are still improving the whole experience.


Ideally:

1. The compiler would be able to turn a subset of async code into sync code with no runtime cost

2. Awaiting would be the default, with `.await` deprecated and special syntax for getting a raw future instead

That way most code would look the same regardless of being executed synchronously or asynchronously, with exceptions for evaluating multiple expressions in parallel and such.

But #2 probably implies lazy evaluation semantics for all expressions!


Because synchronous IO functions block the current thread and return the value directly, while asynchronous functions return a `Future`, which will eventually resolve to the value and can be polled concurrently with other futures so as never to block.

    fn sync_read() -> Vec<u8> { ... }
    fn async_read() -> impl Future<Output = Vec<u8>> { ... }
    // the second can be written more succinctly as:
    async fn async_read() -> Vec<u8> { ... }


It's not possible with Rust's current abstractions; you would need:

- higher-kinded types / type constructors

- some features to handle differences in the auto-traits depending on the result of the type constructor

- some magic to resolve that

---

- OR namespace overloading e.g. read_to_string$sync and read_to_string$async and magic to resolve that

But async has A LOT of implications which change subtle things around handling it, like e.g. the handling of lifetimes/borrows, Send, Sync, etc.

So this probably wouldn't be worth the complexity it introduces.


Because async based APIs need to return a promise type of some kind (eg Future). You can kind of auto-generate wrapping code to convert synchronous to asynchronous (with some performance cost that may be undesirable) but you can’t generally do the reverse, unless you try to do funky things like pausing/resuming user space fibers (and then issues of lock inversions and things come into play there in a systems level language).


Because in Rust, sync versus async colors the functions and how they're called.



