Asyncio, twisted, tornado, gevent walk into a bar (bitecode.dev)
195 points by BiteCode_dev on Aug 22, 2023 | 77 comments


Great article. I had the misfortune of writing a bunch of twisted code over a decade ago, and I wanted to remind the author that twisted has a feature called "inline callbacks" that allows you to use yield. https://twisted.org/documents/16.4.1/core/howto/defer-intro....

So twisted code can actually look like:

    @inlineCallbacks
    def doIt():
        responseBody = yield makeRequest("GET", "/users")
        returnValue(json.loads(responseBody))
iirc `returnValue` throws an exception of a specific type. It's ugly, but it's also the logical implementation of async on top of yield/generators.
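A toy version of that trick (the names here are made up; this is not Twisted's actual code): the driver catches the special exception and turns it into a return value.

```python
class _ReturnValue(Exception):
    """Carries a generator's 'return value' out through an exception."""
    def __init__(self, value):
        self.value = value

def return_value(value):
    raise _ReturnValue(value)

def run_inline_callbacks(gen):
    # Drive the generator; each yielded value is "resolved" immediately
    # in this toy, where real Twisted would wait on a Deferred.
    result = None
    try:
        while True:
            result = gen.send(result)
    except _ReturnValue as exc:
        return exc.value
    except StopIteration:
        return None

def do_it():
    body = yield "hello"          # stands in for `yield makeRequest(...)`
    return_value(body.upper())

print(run_inline_callbacks(do_it()))
```

On old Pythons a generator body couldn't contain `return value`, so smuggling the value out inside a dedicated exception type was the only option.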


This is so nostalgic. I actually met my cofounder on github due to a discussion on twisted vs gevent back in 2011. I had my initial code in twisted and he wrote the gevent piece. Fast forward 12 years and we still use gevent at http://plivo.com :)

Some of our initial code snippets:

# Twisted

    def __protocolSendRaw(self, name, args=""):
        deferred = defer.Deferred()
        self.__EventQueue.append((name, deferred))
        self.rawSend("%s %s" % (name, args))
        return deferred

# Gevent

    def _protocol_sendmsg(self, name, args=None, async=False):
        if self._closing_state:
            return Event()
        _async_res = gevent.event.AsyncResult()
        _uuid, event = _async_res.get()
        return event


Off topic: PLIVO, the Norwegian term, is actually a protocol used by critical services here. Thought you might find it interesting :)

> PLIVO (an abbreviation for ongoing life-threatening violence) is a procedure for cooperation between the police, the fire service, the rescue service and the healthcare system in incidents where life-threatening violence is perpetrated against several people.


Nice, did not know this. Plivo in Latvian means "flying high"; that was one of the languages we based the name on.


damn i had memories of using plivo back in 2012 2013 2014


returnValue raises a specific exception that inlineCallbacks understands and translates to a normal return value. It was a hack that was only needed for python versions that couldn't have generators return non-None values. With modern versions, you don't need it anymore -- just use return.


My last job (2019 - 2022) was in part shepherding a 2013-era Python 2 Tornado-and-Motor (yes the async implementation of the MongoDB driver for Python) application into Python 3 and such modern niceties.

Absolutely fuck yield-raise based async implementations. Just remembering those induces psychic damage; implementing async as a massive hack on top of the language is impressive but also the kind of thing we're going to whisper to junior engineers as scary campfire stories for DECADES.


While inlineCallbacks exists, and indeed works, I would recommend not using it. I recall it not interacting with mypy particularly well, and in general being such a hack it had some annoying sharp edges. Explicit callbacks might be annoying but at least they are extremely clear on what's happening.


Nowadays twisted supports this syntax:

    async def doIt():
        responseBody = await makeRequest("GET", "/users")
        return json.loads(responseBody)
See https://patrick.cloke.us/posts/2021/06/11/converting-twisted...


Also, asyncio started like this as well.


I've been using gevent for 12+ years in a large codebase that uses it everywhere. It's wonderful, and I'm totally spoiled by how well it interoperates with code that wasn't designed to be async (especially library code that I don't want to refactor before using). There's no boilerplate. It takes no time to teach new people, and rarely do they even need to understand how async works to write the correct code. I think it's a shame that asyncio became the de facto approach from cpython, because it would have been so much better for the ecosystem if cpython blessed gevent as the recommended approach. Still, gevent continues to be the best solution today and beyond.


agreed. There is an equivalent in Java - RxJava, used extensively on Android and somewhat on servers. Every time I use it, I get a lot of hate.

We use it in EdgeChains https://github.com/arakoodev/edgechains


Summary:

Concurrency has a lot to do with sharing one resource, and Python has dedicated tools to deal with that depending on the resource you must share.

If you have to share one CPU while waiting on the network, then the specialized tools for this are asyncio, twisted, trio, gevent, etc.

Asyncio is the current standard to do this, but tornado, gevent and twisted solved this problem more than a decade ago, while trio and curio are showing us what the future could look like.

But chances are, you should use none of them.


> But chances are, you should use none of them.

So, no event-oriented programming in Python, or by some other mechanism, or…?


From the conclusion of the article:

> If you need to get a few URLS fast, a ThreadPoolExecutor is likely the Pareto solution, 99% of the time.

I agree with this. ThreadPoolExecutor is easy to use. In general, using threads in Python to handle concurrent I/O is pretty simple and plenty performant.
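For illustration, a self-contained sketch of that Pareto solution; the fetch is simulated with time.sleep so it runs anywhere, but swapping in a real HTTP call (urllib.request.urlopen, say) works the same way:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch(url):
    time.sleep(0.1)          # stands in for blocking network I/O
    return f"body of {url}"

urls = [f"https://example.com/{i}" for i in range(10)]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=10) as pool:
    bodies = list(pool.map(fetch, urls))
elapsed = time.perf_counter() - start

# All ten "requests" overlap in the pool, so this takes roughly 0.1s,
# not the 1s it would take sequentially.
print(f"{len(bodies)} pages in {elapsed:.2f}s")
```

No event loop, no await, no special libraries: the GIL is released during blocking I/O, so plain threads overlap the waits just fine.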


It’s a waste if you need coordination between the tasks though. Putting an async task aside and sending only dumb synchronous function calls that never wait is more efficient than blocking a thread while it waits on another.

asyncio with an executor is a good way to run complex tasks in parallel.


I'm not sure whether this was your intention, but I think you are in full agreement with the article.


Or just use an async Http client with a little asyncio snippet. Depends on what you're comfortable with and what code you have running already.

If you're a junior and have no clue, your best bet is to ask a senior in your team though.


> So, no event-oriented programming in Python, or by some other mechanism, or…?

My guess is he's saying that anything that introduces concurrency is hard, and best avoided if you can do so. He's not wrong. True event driven code (without these libraries) introduces a lot of unneeded concurrency. You see it at its worst in older UIs. There a central event loop receives mouse, keyboard and IO events from the OS. The event handlers it calls transform those events into higher level events like "field changed" and "focus change", and re-inject them into the event loop, where they get turned into still higher level events like "modal dialogue closed" until the job is done. It's a prick of a way to write a program.

If you just want parallelism (as he defines it in the article), threads with their conventional stack model are far easier to use. The problem with conventional threads is that you also get unwanted concurrency and the heisenbugs that go along with that. But green threads (aka fibres, aka cooperative multitasking) eliminate that sort of non-determinism. Green threads are effectively what gevent provides. That's why he's so positive about it.

The other libraries he discusses aren't about event oriented programming either. Rather, they tame the event loop, allowing you to write code in a way that looks similar to threaded code. And they do it without introducing unwanted non-determinism. But they come at the cost of coloured methods, so you have 2 ways of doing everything. That is why asyncio has no ftp / smtp libraries - because they have to be written in the different colour. With green threads (and gevent) all that old code continues to work. That is why these libraries suck compared to green threads. But if you have a language like javascript that doesn't have threads, they are a huge leap forward over the raw event loop processing so you'd take it any day.

What has me scratching my head is why Python introduced these higher level event libraries at all. They own the language - they could have gone the green thread route. As for Rust introducing colored code - words fail me.
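For anyone who hasn't hit the colored-function problem described above, a minimal illustration (toy names, nothing from the article):

```python
import asyncio

async def fetch():          # a "red" (async) function
    await asyncio.sleep(0.01)
    return 42

def business_logic():       # a "blue" (sync) function
    # You can't just call fetch() here and get 42: calling an async def
    # returns a coroutine object, it doesn't run the body.
    coro = fetch()
    assert asyncio.iscoroutine(coro)
    coro.close()            # discard it to avoid the "never awaited" warning
    # The only way through is to enter async-land explicitly:
    return asyncio.run(fetch())

print(business_logic())
```

This is exactly why every sync library needs an async twin, and why green threads (where ordinary blocking calls just work) avoid the split entirely.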


Really good write up. Been using gevent in pyinfra[1] for years and swear by it. Had some pains with setup, and am usually very wary of such magic, but it’s just really solid. Mostly write go these days though which has taken the shine off for sure!

Twisted, however, is a different beast. Have spent a decent chunk of time working on the Matrix synapse homeserver[2], written in twisted, and oh my it just sucks.

[1] https://github.com/Fizzadar/pyinfra

[2] https://github.com/matrix-org/synapse


Eventlet (Linden Labs) actually owns the bar, then.

Edit

Green threads > Green threads in other languages: https://en.wikipedia.org/wiki/Green_thread#Green_threads_in_...

Coroutine > Comparison with > Threads, Generators: https://en.wikipedia.org/wiki/Coroutine#Implementations_for_... :

> Generators, also known as semicoroutines, [8] are a subset of coroutines. Specifically, while both can yield multiple times, suspending their execution and allowing re-entry at multiple entry points, they differ in coroutines' ability to control where execution continues immediately after they yield, while generators cannot, instead transferring control back to the generator's caller.[9] That is, since generators are primarily used to simplify the writing of iterators, the yield statement in a generator does not specify a coroutine to jump to, but rather passes a value back to a parent routine.

> (However, it is still possible to implement coroutines on top of a generator facility)

Asynchronous I/O > Forms > Light-weight processes or threads: https://en.wikipedia.org/wiki/Asynchronous_I/O#Light-weight_...

Async/Await > History, Benefits and criticisms: https://en.wikipedia.org/wiki/Async/await


> That is, since generators are primarily used to simplify the writing of iterators, the yield statement in a generator does not specify a coroutine to jump to, but rather passes a value back to a parent routine.

Another term for this is asymmetric coroutine, as opposed to symmetric coroutine. Coroutines in Lua are asymmetric, but they're not generally referred to as generators as they're much more capable than what are called generators in other languages, like Python. This is largely because Lua's coroutines are stackful rather than stackless, which is an orthogonal, more pertinent dimension when implementing concurrency frameworks.

You can implement symmetric coroutines using asymmetric coroutines, and vice versa, by implementing a higher-level library that implements one using the other. So in principle they have equivalent expressive power, formally speaking. Ultimately the distinction between these terms, including generator, comes down to implementation details and your objective. And just because a language name-drops one of these terms doesn't mean what they provide will be as useful or convenient in practice as a similarly named feature in another language. Other dimensions--stack semantics, type system integration, etc--can easily prove the determining factor in how useful they are.


Great summary.

One of the Lua creators digs into this in the paper Revisiting Coroutines. It is a very readable paper. https://www.inf.puc-rio.br/~roberto/docs/MCC15-04.pdf


PROMPT: Generate a table of per- process/thread/greenthread/coroutine costs in terms of stack/less/heap overhead, CPU cache thrashing impact, and compatibility with various OS schedulers that are or are not preferred to eBPF

Is it necessary to use a library like trio for nonblocking io in lua, or are the stdlib methods all nonblocking with big-O complexity as URIs in the docstrings?

Trio docs > notes on async generators: https://trio.readthedocs.io/en/stable/reference-core.html#no...


Can you please rewrite this in English?


IIRC the history of the async things in TLA in order: Twisted (callbacks), Eventlet (for Second Life by Linden Labs), tornado, gevent; gunicorn, Python 3.5+ asyncio

[ tornado (FriendFeed, IPython Notebook (ZeroMQ (libzmq)),), Sanic (asyncio), fastapi, Django Channels (ASGI), ASGI: Asynchronous Server Gateway Interface, django-ninja, uvicorn (uvloop (libuv from nodejs)), ]

The Async/await keywords were in F# (2007), then C# (2011), Haskell (2012), ... Python (2015), and JS/ES ECMAScript (2017) FWICS from the wikipedia article.

When we talk about concurrency and parallelism, what are the different ~async patterns and language features?

Processes, Threads, "Green Threads", 'generator coroutines'

We necessarily attempt to define such terms, but implementations of the now more specifically-named software patterns interpret them differently, so it's worth taking the time to see what Wikipedia has to teach on this.


Uvicorn docs > Deployment > Gunicorn:

https://www.uvicorn.org/deployment/#gunicorn :

> The following will start Gunicorn with four worker processes:

    gunicorn -w 4 -k uvicorn.workers.UvicornWorker

> The UvicornWorker implementation uses the uvloop and httptools implementations.

PROMPT: Generate a minimal ASGI app and run it with Gunicorn+Uvicorn. What is uvloop?

PROMPT: How does uvloop compare to libzmq and eBPF? Which asynchronous patterns do they support?

PROMPT: Which Uvicorn (security) http headers work with which k8s Ingress pod YAML attributes and kubectl?

PROMPT: (Generate an Ansible Role and Playbook to) Host an ASGI webapp with Kubernetes (k8s) Ingress maybe with ~k3d/microshift locally


If you want to write the history of Python async things in order, you'll need to put Twisted in many more of the places: it was callbacks only in the very beginning, then worked via generators, and once Python added async syntax used that. "Twisted (callbacks)" is a simplification that ignores the influence Twisted has had even on non-Python ecosystems; Deferreds have been imitated all over the place.


Before callbacks, there were function pointers and there was no garbage collection; and before function pointers, there were JMPs, trampolines, and labels in ASM.

I don't share any reverence for the Twisted callback patterns that AJAX also implements. And, the article isn't about callbacks.

Promises in JS have separate success and error callbacks. https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guid...

MDN > Async function: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe... :

> The async function declaration creates a binding of a new async function to a given name.

> The await keyword is permitted within the function body, enabling asynchronous, promise-based behavior to be written in a cleaner style and avoiding the need to explicitly configure promise chains


I struggle to call fastapi a microframework, or even bundle it in the same category as flask. Starlette, the actually small framework at the core of fastapi, fits the description better and would have deserved a mention imho.


But its also not a framework as it offers barely any facilities for mid+ size projects, which immediately start to crack under the weight of trying to reimplement Django


Is it true that these frameworks only work with async compatible libraries? Will a typical pypi lib work with them? For example, a css or xpath parsing library, or date time library - do I have to worry about those blocking and not working properly with asyncio or the others? Or only network and file reading libs?

I wish the article had spent more time on this. Without more info I would probably just use what the author mentions at the end as the Pareto ideal solution (ThreadPoolExecutor) because usually async frameworks in not historically async languages end up being islands within the larger community that need their own bespoke libraries.


You don't use IO/make any external calls when you are parsing, or doing date time, so they don't have an opportunity to "cooperate" by letting another thread execute while they wait for the IO to finish.


Thanks. So even chewy compute operations (parsing a big html page then running a complicated xpath against it) are ok? What would be an example of blocking - is it just network io? Presumably file io too?

Update: I think I’m conflating a bit what I want to speed up with what is allowed. Presumably heavy compute stuff is perfectly compatible with asyncio but it won’t speed it up - it would speed up a lot of io operations. ThreadPoolExecutor can speed up heavy compute by parallelizing (if it’s heavy enough) but may be overkill for just downloading 20 web pages at once.


The basic idea is that file IO and socket IO will allow a task to be suspended while the IO is happening, and let another task do its work.

On top of that, you can write a C extension that will let you run compute heavy work in a way that allows your task to be suspended (for example, the C extension spins up its own thread). This extension has to be "very careful", basically by avoiding touching Python-side data during this work.

The way this sort of stuff ends up working is you pass data into a C extension, and that extension takes ownership of the data or copies it or whatever, does what it needs, then gives Python back some result.

But if you're just pure-Python compute heavy, then your task won't be suspended. So everything will run, but your compute-heavy stuff will hog the CPU, and won't be suspended. (Though if you have compute heavy work that is, like, looping over data, you could add `sleep(0)` between every couple of iterations. This gives other tasks a chance to run! This could be good enough to prevent weird bottlenecks).

But the ultimate thing is if you have N compute-heavy tasks, you probably won't get speed advantages. If you have 1 compute-heavy task and N IO-heavy tasks, you can get advantages (even if the IO-heavy stuff is interspersed). But if you have N compute-heavy tasks and not much IO-heavy tasks, multiprocessing can get you where you want (since it's usually IO-heavy stuff that is helped out the most with async/await)
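A minimal sketch of that `sleep(0)` trick (names and numbers here are illustrative): a pure-Python compute loop yields the event loop every so often, so a concurrent "heartbeat" task still gets scheduled while the crunching happens.

```python
import asyncio

heartbeats = []

async def heartbeat():
    for _ in range(5):
        heartbeats.append("tick")
        await asyncio.sleep(0.01)

async def crunch():
    total = 0
    for i in range(50_000):
        total += i * i
        if i % 1000 == 0:
            await asyncio.sleep(0)   # politely hand control back to the loop
    return total

async def main():
    # Both tasks run on the same single-threaded event loop.
    result, _ = await asyncio.gather(crunch(), heartbeat())
    return result

result = asyncio.run(main())
print(result, len(heartbeats))
```

Without the `sleep(0)` calls, `crunch` would hold the loop for its entire run and the heartbeat would be starved until it finished.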


If you are parsing a big thing of text, there is no opportunity for your thread to pause the CPU and say "ok other threads, I have to wait on some external data" and then hand over the CPU resources.

But if you are trying to make a network request, disk or memory access, or any HID stuff, you might have to wait on that thing to do its job and get back to you. At this point you can tell your thread to raise its hand and say "hey, I don't know how long, but I know I need to wait for this thing to finish, so while I wait someone else can use the CPU, but when I am done waiting, I will need the CPU back." This is the core idea of "cooperative threads" or green threads or co-routines or any sort of thread that is at the language runtime level and not at the OS level.

https://en.wikipedia.org/wiki/Cooperative_multitasking

So consider if the task you are doing is asking for data from "somewhere external to my thread" then you might be able to make that call non-blocking to enable more throughput.

Hope that helps explain it!


CPU-heavy operations are allowed, strictly speaking, they're just "impolite" and may cause subtle problems.

for example, say you have an aiohttp-backed web server, and in the handler for some URL route, you do a CPU-bound computation that takes 5 full seconds to execute.

a crucial thing to remember is that asyncio programs, by default, are still single-threaded. so for those 5 seconds, your computation is the only thing running. other event loop tasks are unable to run. the loop is "blocked" in the same way it would be as if you called `time.sleep()` (not `await asyncio.sleep()`, but the regular synchronous sleep method) or made a blocking IO call (in a pathological case, you might open and read a file on a remote NFS share mounted over a WAN link, for example)

now, suppose your web server exposes an `/alive` healthcheck endpoint. during that 5-second interval where you're hogging both the CPU and the event loop, aiohttp won't be able to dispatch requests to the healthcheck endpoint. if your load balancer has a 3-second timeout for those healthchecks, from its perspective the service will appear to be flapping between online and offline, even though the service itself never actually goes down.

the "polite" thing to do, for this sort of big CPU-bound task, is to hand them off to a ThreadPoolExecutor like you said (or ProcessPoolExecutor if you have GIL concerns) with `loop.run_in_executor` [0]. that gives you a future, and when you `await` the future you are politely yielding the event loop to allow other tasks to run, such as those requests to the healthcheck endpoint.

0: https://docs.python.org/3/library/asyncio-eventloop.html#asy...
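A minimal sketch of that pattern (the workload here is made up): the blocking function runs in a worker thread, and awaiting the future yields the event loop so things like healthcheck handlers keep getting served.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

def expensive(n):
    # Blocking, CPU-bound work that would otherwise hog the event loop.
    return sum(i * i for i in range(n))

async def main():
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor() as pool:
        # Awaiting the executor future politely yields the loop
        # while the worker thread does the heavy lifting.
        return await loop.run_in_executor(pool, expensive, 100_000)

result = asyncio.run(main())
print(result)
```

Swap in a ProcessPoolExecutor if the GIL means the worker thread would still starve the loop of CPU time.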


> it would speed up a lot of io operations

It doesn't speed up the I/O operation, but allows the CPU to continue executing other code rather than blocking and waiting (and the program becoming unresponsive).

If your app is querying a database ten times per user request and averaging 5-10ms per query, your CPU is spending 50-100ms doing nothing but waiting on the network to finish so it can resume executing the code that comes next. That's time other code that has CPU instructions that can execute now would benefit from.
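A toy demonstration of the point above, with made-up 10ms "queries": awaited one after another the waits add up, but under asyncio.gather they overlap, so the wall-clock time collapses.

```python
import asyncio
import time

async def query(i):
    await asyncio.sleep(0.01)   # stands in for a 10ms database round trip
    return i

async def sequential():
    return [await query(i) for i in range(10)]

async def concurrent():
    return await asyncio.gather(*(query(i) for i in range(10)))

start = time.perf_counter()
asyncio.run(sequential())
seq_time = time.perf_counter() - start    # roughly 10 x 10ms

start = time.perf_counter()
rows = asyncio.run(concurrent())
conc_time = time.perf_counter() - start   # roughly one 10ms wait

print(f"sequential: {seq_time:.3f}s, concurrent: {conc_time:.3f}s")
```

Each individual "query" still takes 10ms; nothing got faster, the CPU just stopped standing around between them.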


asyncio yes. gevent no. if you patch at an entrypoint you'll be ok 99 times out of 100.


Here's the said black magic (for amd64/unix) that the gevent project relies on:

https://github.com/python-greenlet/greenlet/blob/master/src/...

I still think it's too much for a Python project -- if you think you need this in 2023, maybe you need to reevaluate? I'd seriously have another look at Go.


All greenlet does is allow the Python frames to be managed separately from the C call stack - it is very similar to what PyPy is also doing. I don't think your characterization is at all accurate.


Sorry I wasn't clear -- my point was that a typical python programmer won't be able to troubleshoot these bits -- it'd be way outside their skill set.

Before anyone jumps in to say nobody ever has/will ever have to troubleshoot this because it's so stable and used everywhere and so forth, please have a look at the comments at the top of that file. This code had to be patched eg. due to a compiler update.

Remember, this is open source and this code is offered without any guarantees. And non-existing guarantees of greenlet/gevent is quite different from the non-existing guarantees of CPython (no idea about PyPy).

Implementing cooperative multitasking at this level is way off the beaten path and should only be attempted by those who know what they are doing.

Today, I'm sure this code works perfectly on linux 6.4.7 when compiled by gcc 12.3.1 against python 3.10 on amd64 architecture in release mode. But can you guarantee it will work on linux 7.8.17 when compiled by gcc 15.2 against python 3.21 on AARCH64 10.4 architecture? No you can't. If you are lucky, it'll just die with a general protection fault or something. But it can also break in subtle ways that only a seasoned systems programmer can realize, let alone fix.

So if you are a professional, you care about maintainability, you care about forward compatibility, you are starting a greenfield project, you are sure you need cooperative multitasking and you ask nicely my opinion about which async framework you should choose, I would strongly advise you to steer clear of gevent/greenlet based libraries.

Or, you know, just hack away :)


That's surprisingly readable and understandable. Nice work by the authors for encapsulating it so well.


Really surprised by the positive comments here.

The article provided a grocery list of concurrency primitives which IMO was designed to paint the misleading picture that concurrency in Python is too hard (when everyone just uses the excellent and standard asyncio library in practice). Then the article ends with the faulty conclusion not to use any concurrency primitive. (Again, why? Where is the logic in this conclusion?) Python's asyncio library handles both CPU-bound work and I/O-bound work. It has data structures for queues, callbacks, push and pull style usage, sleep, 'background' execution, future results, and much more. It's very well designed and tested.

The article vaguely alludes to async tasks being hard to debug because their execution order is obscured. But... that's kind of the point of using them. Programs written to use this logic flow don't have to be weighed down waiting for every result. It's a huge benefit in Python land where you have a GIL on a core and it's able to switch between multiple tasks without blocking the others for no reason. The article didn't provide any reason why you would want to consider other options to asyncio which makes their inclusion seem like more of a flex than anything.

I'm wondering now if most of the commenters here were just OPs friends shilling their support for something they don't understand.


I have worked a lot with async libraries in the past. Twisted was very powerful, but the documentation is a mess and the risk of writing spaghetti code is very high. IMHO the best async libraries are trio/curio, you can learn how to use in just couple of hours.

Anyway, in 2023, if I need something that relies heavily on async I/O, I will use go without any doubt. One of my side projects is basically a scraper: I started from normal Python, then multithreading, then asyncio, then trio, and in the end I ported it to go+colly with a lot of benefits (speed and ease of maintenance).


I picked gevent for a greenfield backend (in IoT/ML space) back in 2019. No regrets. The main contender was asyncio, which was the hot-new-thing at the time (still is, I guess). At the time, I did not feel that the ecosystem and programming practices for asyncio were good enough. That has of course matured a lot now, so it is no longer an issue. But I would actually still consider gevent again for a new project today. Possibly the asyncio web frameworks (ie FastAPI) have the edge over gevent-compatible (ie Flask) though, which might tip the scales.

I remember using Twisted some 10 years ago. Never again...


gevent is the way to go these days tbh. asyncio is close but the dependency on async-aware libraries can be problematic.

surprised the author ended on the note that none of these are relevant - i use a lot of these tools on a weekly basis.


Great write up on something I depend on a lot but don't understand well.

This also answers something I've always wondered about which is how Twisted and Tornado fit into the Python web framework landscape and whether I should use them. Tornado always seemed popular but slow and less intuitive than Flask/Django. And then Twisted was a far lower level library but people were still building APIs on it.

Where does Eventlet fit into the picture? Is it a similar box of magic monkey patches like GEvent?


>They will not make two calculations at the same time (can't use several CPU cores like with multiprocessing)

Not true. Future executors can run in a process pool (on different cores) or using threads (same core, different event loops).

>You have to understand that async programming is hard, and no matter how good the tooling is, it's going to make your code more difficult to manage. It has a high price.

Dude starts out with an article that lists the entire kitchen sink for concurrency, neglecting to mention that literally none of these libraries are even considered by Python devs. Then concludes that writing async code is too hard (writing 'await' before a call is too hard.)

To me this entire article reads like someone's attempt to over-complicate things by trying to overwhelm you with information to seem impressive even though 99% of it is irrelevant to the discussion. Then makes a non-sequitur conclusion to make it seem like only 'experts' ought to touch any of this because [for you] (le simple-minded pleb) you will inevitably break everything. Lmao, fuck that, that's condescending as hell. Literally just use async def for functions that do IO and write await. There's a few things to learn but any dev is capable of doing it.

>I’ll add you probably should not go with asyncio manually. Use a higher level asyncio based lib, or better, framework. Async is hard enough as it is.

Horrible advice. Asyncio is high level already and comes standard in Python 3.6+. You will get far more trouble using third-party concurrency libraries than trying to use the Python library, which is actually very elegant and well-supported. Twisted sucks, by the way.


> Literally just using async def for functions that do IO and write await.

If only it were that simple.

> You will get far more trouble using third-party concurrency libraries than trying to use the Python library which is actually very elegant

Again, you overlook the fact that asyncio requires you to explicitly launch a concurrent context. Asyncio is fine, but it is not "elegant" because of this.


> asyncio requires you to explicitly launch a concurrent context

`asyncio.run()` [0] was added in 3.7, and as far as I know is the current recommended way to do this.

as shown in the asyncio "hello world" example [1], you can write your main() function to be async, and then the entire process runs asynchronously from the beginning.

and as shown on that page, you can also use `python -m asyncio` to get a REPL with a running event loop where using `await` will work as you'd expect it to.

one of my biggest quibbles with a lot of asyncio demo code snippets is the use of `loop.run_until_complete()` [2] and similar lower-level primitives. in most asyncio applications you simply want to start the loop as early as possible with `asyncio.run(main())`, let it run for the duration, and have the entire process exist in async-land.

0: https://docs.python.org/3/library/asyncio-runner.html#asynci...

1: https://docs.python.org/3/library/asyncio.html

2: https://stackoverflow.com/questions/40143289/why-do-most-asy...
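A minimal sketch of that recommended shape (toy coroutine, illustrative names): one asyncio.run() at the very top, and the whole process lives in async-land from there.

```python
import asyncio

async def fetch_greeting():
    await asyncio.sleep(0.01)   # stands in for real async I/O
    return "hello world"

async def main():
    # Everything below here is already inside the event loop,
    # so we can await freely without touching loop objects.
    return await fetch_greeting()

# Preferred over the older loop = asyncio.get_event_loop() /
# loop.run_until_complete(main()) dance: run() creates the loop,
# runs main() to completion, and closes the loop for you.
result = asyncio.run(main())
print(result)
```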


Yes, these are some ways to get that context. But it can be difficult to do correctly for brownfield projects or libraries, especially if you need to support both sync and async. Not elegant!


> especially if you need to support both sync and async

I don't think there's any possible way to do that elegantly.

maybe gevent-style monkeypatching makes it look elegant, but there's a whole pile of inelegant stuff happening under the covers to present that illusion.

and I would say that almost by definition, if you're monkeypatching the stdlib, you are banished from the realm of elegance.


Took me a while to figure out that the real content of the website hides under a popup covering the whole page.


Damn, I explicitly disabled the subscribe wall in the settings. WTF?


I have always had a hard time with this parallel / concurrent distinction. I think it’s because of how many layers of abstraction the machine has. Many threads per core. Many cores per package. Multiple packages on a motherboard.

Is there a practical distinction? It feels like there isn’t. It’s not as if a two CPU machine can run two Python processes “in parallel” as opposed to “concurrently”. I have 100 other processes running too. It’s all concurrency, and never parallel.

Perhaps the author and others are distinguishing between concurrency that can happen at the granularity of any instruction (e.g. threading) compared to cooperative yielding that only happens when await is called (e.g. asyncio.)


> Is there a practical distinction?

I’m not sure if this will help, but I think of it like - parallelism is running the same task in multiple threads / on multiple cores at once. And concurrency is “concurrently” running different tasks on one core.

Nodejs is a purely single threaded runtime. Let’s say you have a web server in node processing http requests.

- Your program implements concurrency by yielding back to the event loop instead of blocking. Request 1 comes in, and then sends a query to the database. While we wait for the database to respond to the query, we can process request 2.

- You usually implement parallelism with nodejs by running multiple instances of your nodejs server process across all your cores. (Then use a program like nginx to load balance requests across all the node processes.)

The GPU might be a better example. When rendering a frame, the graphics card runs a small program (called a fragment shader) for every pixel on the screen. A 4090 has ~16000 cores which can all run the same fragment shader on different pixels at the same time. This is called parallelism, not concurrency. Concurrency would be if you had two video games open at once and your graphics card was quickly swapping between rendering frames for game A and game B.
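The web-server half of this maps directly onto Python: concurrency is one event loop interleaving requests while they wait on I/O, and parallelism would be multiple processes (e.g. a `ProcessPoolExecutor` or several server instances, like the multiple node processes above). A stdlib-only sketch of the concurrency part, with the database call simulated by a sleep:

```python
import asyncio

log = []

async def handle_request(n):
    log.append(f"start {n}")
    await asyncio.sleep(0.05)  # "waiting on the database": loop runs other requests
    log.append(f"finish {n}")

async def main():
    # Request 2 starts while request 1 is still waiting on its query.
    await asyncio.gather(handle_request(1), handle_request(2))

asyncio.run(main())
print(log)  # both requests start before either finishes
```

Both lifetimes overlap in time (concurrent), but only one of them is ever executing Python at a given instant (not parallel).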


> It’s not as if a two CPU machine can run two Python processes “in parallel” as opposed to “concurrently”. I have 100 other processes running too. It’s all concurrency, and never parallel.

If they're both CPU-bound they probably can. Most of the time those 100 other processes are probably sleeping and not actually using the CPU (unless you're at 100% CPU usage all the time? My computer tends to sit at around 5-15% usage)


My view: “concurrent” just means that there can be a point in time when two tasks/functions/coroutines/whatever have started but haven’t finished. “Parallel” means that they actually run at the same time, e.g. on different CPU cores. Concurrent is more general.


> The only danger is if you call gevent.monkey.patch_all() too late.

Gevent takes over the runtime, which often breaks advanced asyncio usage in gevent programs. Patching things globally should obviously be seen as dangerous in itself.
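The "too late" failure mode is easy to reproduce without gevent: any module that did `from time import sleep` before the patch keeps a reference to the original function, so replacing `time.sleep` afterwards doesn't reach it. A stdlib-only sketch of the mechanism (the fake sleep stands in for gevent's cooperative one):

```python
import time
from time import sleep as early_bound_sleep  # simulates a module imported pre-patch

original_sleep = time.sleep

def fake_sleep(seconds):
    # Stand-in for gevent's cooperative sleep: records the call instead of blocking.
    fake_sleep.calls.append(seconds)

fake_sleep.calls = []
time.sleep = fake_sleep   # the "monkeypatch", applied too late for early importers

time.sleep(1)             # goes through the patch
early_bound_sleep(0)      # still the original, blocking implementation

time.sleep = original_sleep  # restore
print(fake_sleep.calls)   # only the lookup through the module saw the patch
```

This is why `gevent.monkey.patch_all()` has to run before anything else is imported: it can only rebind names, not references that were already taken.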


That cashier analogy seems quite bad. You're looking for an example of cooperative multitasking and you're taking customers waiting in a line, the least cooperative group of people in the world? Also, I have never seen a store setup where a cashier is shared between two lines and is handling the other line while they are waiting for your payment to go through.

I feel like car sharing services are a better analogy. If you're not using the car, someone else can.


I don’t fully disagree, but I get that the author just wanted something straightforward and oversimplified because the scope of the article was something else.


Fantastic writeup, and the code examples go a long way to making this comprehensive article easy to consume.

That being said, I disagree with this: "I strongly believe beginners should start their first serious project with django and not flask, despite the fact most people see it the other way around"

Given the dearth of django content for beginners, flask seems much easier to learn. This is based on my own work over the last year, so YMMV.


> You have to understand that async programming is hard, and no matter how good the tooling is, it's going to make your code more difficult to manage. It has a high price.

Is it unfashionable to say that NodeJS got this right with > v7.6? It's always felt pretty straightforward to me but perhaps it just suits my mindset. It's only when I try to do this in other languages (e.g. Python or Ruby) that it starts to feel clunky.


I'm the first one to hate on JS, but the whole integrated event loop, with auto-scheduled async tasks and a Promise system that works with both callbacks and async/await, is one of the easiest systems for HTTP calls I've ever used.
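Python's asyncio eventually grew a similar shape: a Task is roughly a promise, usable from both the callback style and the async/await style on the same object. A stdlib-only sketch (the HTTP call is simulated with a sleep):

```python
import asyncio

async def fetch(url):
    await asyncio.sleep(0.01)  # stand-in for a real HTTP request
    return f"body of {url}"

results = []

async def main():
    task = asyncio.ensure_future(fetch("/users"))
    # Callback style, like promise.then(...):
    task.add_done_callback(lambda t: results.append(t.result()))
    # async/await style on the very same object:
    results.append(await task)

asyncio.run(main())
print(results)  # both styles observed the same result
```

The difference is that JS ships one blessed event loop that is always running, while in Python you have to start (and agree on) the loop yourself, which is where the asyncio/twisted/gevent friction comes from.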


I love how I read the setup, then clicked on the comments and got the punchline delivered perfectly by Hacker News:

“We're having some trouble serving your request. Sorry!”


What a good, clue-full post! I hope there is a trio-like future for async. And I really wish python had not added it to the standard library - it’s tricky, complex, and doesn’t play well with the rest of the ecosystem. It’s really a fun challenge to use, and better than reasoning about threads but definitely not ready for mass usage.


I haven't used gevent, but while asyncio doesn't completely burn the codebase, I've met projects that were built on tornado for which it was disastrous.

What I really like about asyncio is that there aren't many deps or a forced loop needed to get it working.

But I could be misled and just not know how it is achieved in tornado. Does gevent burn the codebase as well?


Tornado evolved extremely nicely. Gone are the days of needing callbacks or decorators. Now it's just a matter of putting async in front of def and there's your asynchronous handler.


Honestly asyncio is good and you shouldn't bother with twisted, tornado or gevent ever unless you inherit an existing code base. I like trio's features, but code quickly turns super verbose and ugly with the abundance of with statements - maybe it's more bearable with a bit of syntax sugar.


I personally really loved using Twisted back in the day. It did take some time to figure out what exactly "inline callbacks" did, but after that it was easy.

I still miss Perspective Broker and Producers/Consumers. Those things were rather powerful.


I still have nightmares about Twisted’s documentation.


Kids these days will never know the struggle of having to search for libraries that were not just in your language, but also could interop with your inane monkey-patching-based concurrency model of choice. Python used to be an absolutely horrendous language for anything involving concurrency, nearly as bad as ruby. Anyways, long live asyncio.


ever look at a twisted stack trace?


And worse, when you do, something loops back.


How is FastAPI in large projects?


It's not fast that's for sure.



