I investigated what explains the huge gap between the few fastest frameworks and the rest.
The answer is deceptively simple: they didn't achieve revolutionary optimizations.
The thing is: on many of those benchmarks, the bottleneck is obviously the DB.
The ability to run DB queries asynchronously and in batches is the differentiating factor.
Only PostgreSQL supports such a feature, but you need support in the PostgreSQL client too.
libpq, the official C PostgreSQL client used everywhere, does not support said feature except via a patch from 2016.
Yes, the secret of Drogon (and probably of Lithium) is that they use a fork of libpq from 2016, because upstream can't agree on merging the patch and nobody is working on upstreaming it.
Actix Web benefits from the feature because its client, tokio-postgres, is a reimplementation of the wire protocol and does not use libpq.
The industry-grade server ecosystem that is the JVM uses JDBC, which sadly sits on a blocking socket and thus does not allow asynchronicity.
But when Loom arrives, all existing JDBC code will magically, automatically become truly asynchronous, so Spring should land in the top four.
There is also a JDBC wrapper built on Kotlin coroutines, and there are reactive alternatives such as R2DBC. It is unclear as of today whether such solutions enable PostgreSQL async queries and batch processing. It seems nobody has tried them on TechEmpower, which is sad.
Finally, one could use libpq over JNI.
Edit: I have read that the next release of pgjdbc (43) will switch from the standard socket to non-blocking NIO sockets.
What should heuristically be the fastest HTTP framework (H2O, in C) has refused to use the old libpq fork because the API is not stable and thus not production-grade.
Indeed, SQL query batching is the key (and I personally think the TechEmpower website should make it explicit); asynchronous communication with the DB and with the HTTP client is also key.
FYI, Lithium is not using a libpq fork from 2016; the fork was rebased on master a month ago. But yes, Lithium (the sql-pipeline branch) is using it. (I mailed the Drogon maintainer so it gets updated as well.)
About H2O: the non-batched version of Lithium is as fast (slightly faster on some tests, slightly slower on others), while being much simpler to use (the TFB implementation of H2O is 4400 lines vs. 250 lines for Lithium...).
Interesting, thanks for the answer.
What I would really like to see is an obligation to name it "Lithium-with_batch", alongside a "Lithium" version without it. That would show the impact of the other optimizations, which are currently hidden by the paradigm shift.
I added -pipeline, but I guess -with-batch is more explicit; I'll change it then. Anyway, the TechEmpower team will eventually add a special tag for batching. There is actually a version of Lithium not using batching: check for 'lithium-postgres' in the benchmark tabs other than 'composite score'.
Other big optimizations are:
- non-blocking communication between the database and the framework
- non-blocking communication between the HTTP client and the framework
This is for C++; for slower interpreted languages, calling C bindings is a big optimization as well (this is how some PHP frameworks get good performance, for example).
Thanks. Asynchronicity is one thing, but there is something that in theory can achieve better performance than asynchronicity alone: I'm referring to the reactive programming paradigm, which benefits from the concept of backpressure.
If you've heard of it, do you think it could achieve even better performance?
https://medium.com/@jayphelps/backpressure-explained-the-flo...
I had a look at the video, but I don't think there is backpressure in this benchmark, since there are only 512 connections max and each connection waits for the server's response before sending a new request. In other words, the load generator never sends more requests/s than the server is able to handle. (Tell me if I misunderstood backpressure.)
Mmh, then I wonder if the TechEmpower benchmark suite would benefit from a new benchmark that is backpressure-sensitive, increasing its coverage of real-world workloads?
Yes, increasing the number of connections has already been discussed. But still, several thousand simultaneous connections is fine for the best-performing frameworks...
> But when Loom arrives, all existing JDBC code will magically, automatically become truly asynchronous, so Spring should land in the top four.
I know less than zero about Java and Loom, but if I understand correctly, Loom enables first-class continuations and seamless M:N threading.
But I do not see how that will help JDBC (unless you have tens of thousands of DB sessions); my understanding is that the power of the new libpq is the ability to pipeline queries, and that won't magically appear without development effort.
I'm not an expert, but I believe this would not allow batching, only pipelining, and only to a limited extent, since spawning OS threads is "slow" and by default I doubt they spawn more than 2 × CPU count threads?
I want to approach it with the same level of polite humility you've offered. Presumably you could use a thread pool proportional to the number of connections in your connection pool?