It's so funny how there is almost no 'science' - or 'engineering' - in modern 'computer science' or 'software engineering'. The findings on OP's website, about what is or isn't fast for which purpose, should not surprise us 79 years after the first programmable computer. Yet we go about our work blissfully ignorant of the actual capabilities of the things we build with.
We don't have published hypotheses, experiments, and results showing what a given computing system X, made up of Y, does or doesn't achieve. There are certainly research papers, algorithms, and proofs of concept, but (afaict) no scientific evidence for most of the practices we follow and the results we get.
We don't have engineering specifications or tolerances for what a given thing can do. We don't have calculations for how to estimate, given X computing power and Y system or algorithm, how much Z work it can do. We don't even have institutional knowledge of all the problems a given engineering effort faces, and how to avoid them. When we do have institutional knowledge, it's in books from 4 decades ago that nobody reads, and everyone makes the same mistakes again and again, because there is no institutional way to hold people to account for avoiding them.
What we do have is some tool someone made, into which millions of dollars are then poured, without any realistic idea whatsoever of what the result is going to be. We hope that we get what we want out of it once we're done building something with it. Like building a bridge over a river and hoping it can handle the traffic.
There are two reasons creating software will never (in my lifetime) be considered an engineering discipline:
1) There are (practically) no consequences for bad software.
2) The rate of change is too high to introduce true software development standards.
Modern engineering best practice is "follow the standards". The standards were developed in blood -- people were either injured or killed, so a standard was developed to make sure it didn't happen again. In today's society, no software defects (except maybe in aircraft and medical devices) are considered severe enough for anyone to call for the creation and enforcement of standards. Even Teslas full-self-driving themselves into parked fire trucks and killing their occupants doesn't seem to be enough.
Engineers who design buildings and bridges also have an advantage not available to programmers: physics doesn't change, at least not at scales and rates that matter. When you have a stable foundation it is far easier to develop engineering standards on that foundation. Programmers have no such luxury. Computers have been around for less than 100 years, and the rate of change is so high in terms of architecture and capabilities that we are constantly having to learn "new physics" every few years.
Even when we do standardize (e.g. x86 ISA) there is always something bubbling in research labs or locked behind NDAs that is ready to overthrow that standard and force a generation of programmers into obsolescence so quickly there is no opportunity to realistically convey a "software engineering culture" from one generation to the next.
I look forward to the day when the churn slows down enough that a true engineering culture can develop.
Imagine the scenario we would be in if the Standards of Software Engineering (tm) had been laid down 20 years ago. Most of us would likely be chafing against guidelines that make our lives much worse for negative benefit.
In 20 years we'll have a much better idea of how to write good software under economic constraints. Many things we try to nail down today will only get in the way of future advancements.
My hope is that we're starting to get close though. After all, 'general purpose' languages seem to be converging on ML* style features.
* - think Standard ML, not machine learning. Static types, limited inference, algebraic data types, pattern matching, no null, lambdas, etc.
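As a rough sketch of what that convergence looks like in practice (my own toy example, not anything from the article), here's the feature list above expressed in Rust: an algebraic data type, exhaustive pattern matching, Option instead of null, a lambda, and inference doing most of the type annotation work.

    use std::f64::consts::PI;

    // Algebraic data type: a Shape is exactly one of these variants.
    enum Shape {
        Circle { radius: f64 },
        Rect { width: f64, height: f64 },
    }

    // Pattern matching: the compiler checks that every variant is handled.
    fn area(shape: &Shape) -> f64 {
        match shape {
            Shape::Circle { radius } => PI * radius * radius,
            Shape::Rect { width, height } => width * height,
        }
    }

    fn main() {
        let shapes = vec![
            Shape::Circle { radius: 1.0 },
            Shape::Rect { width: 2.0, height: 3.0 },
        ];
        // No null: a possibly-absent answer is an explicit Option.
        // The closure passed to map is the lambda; element types are inferred.
        let largest: Option<f64> = shapes.iter().map(|s| area(s)).reduce(f64::max);
        println!("largest area: {:?}", largest);
    }

Swap the syntax around and this is more or less the core of Standard ML or OCaml, which is the convergence being pointed at.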
The Mythical Man-Month came out in 1975. It was written after the development of OS/360, which was released in 1966. Of the many now-universally-acknowledged truths about software development Brooks set down, his later essay No Silver Bullet encapsulates why "in 20 years" we will still not have a better idea:
"There is no single development, in either technology or management technique,
which by itself promises even one order of magnitude improvement within a decade
in productivity, in reliability, in simplicity."
I like to over-simplify that quote down to:
Humans are too stupid to write software any better than they do now.
We have been writing software for 70 years and the real-world outcomes have not gotten a lot better than when we started. There are improvements in how the software is developed, but the end result is still unpredictable. Without thorough quality control - which is often disdained, and which nothing requires us to perform - you often can't tell from the result whether it was created by geniuses or amateurs.
That's why I would much rather have "chafing guidelines" that control the morass, than to continue to wade through it and get deeper and deeper. If we can't make it "better", we can at least make it more predictable, and control for the many, many, many problems that we keep repeating over and over as if they're somehow new to us after 70 years.
"Guidelines" can't stop researchers from exploring new engineering materials and techniques. Just having standard measures, practices, and guidelines, does not stop the advancement of true science. But it does improve the real-world practice of engineering, and provides more reliable outcomes. This was the reason professional engineering was created, and why it is still used today.
"It's so funny how there is almost no 'science' - or 'engineering' - in modern 'computer science' or 'software engineering'"
It may not have been clear in 2014, but it is now: data scientists are not computer scientists or software engineers. So tarring software engineers with data scientists' practices is really a low blow. Not that we're perfect by any means, but the data point you're drawing a line through isn't even on the graph you're trying to draw.
I was unlucky enough to brush up against that world about a year ago. I am grateful I bounced off of it. It was surreal how much infrastructure data science has put into place just to deal with its mistake of choosing Python as its fundamental language. They're so excited about frameworks, developed over years, for streaming things that a "real" compiled language can either handle easily on a single node or stream just as easily. They simply couldn't process the idea that I was not excited about porting all my code to their streaming platform, because my code was already better than that platform. It was a constant battle, with them assuming I just must not Get It and must not understand how awesome their new platforms were, and me trying to explain how much of a downgrade it was for me.
"We don't have calculations for how to estimate, given X computing power, and Y system or algorithm, how much Z work it can do."
Yeah, we do, actually. I use this sort of stuff all the time. Anyone who works competently at scale does; it's a basic necessity for such things. Part of the mismatch I had with the data scientists was precisely that I had this information and not only did they not, they couldn't even process that it exists, and basically seemed to assume I must just be lying about my code's performance. It just won't take the form you expect. It's not textbooks. It can't be textbooks. But that's not the criterion for whether such knowledge exists.
We do actually have some methods of calculating expected performance. For instance, we know that a Zen 4 CPU can do four 256-bit operations per clock, with some restrictions on which combinations are allowed. We are never going to hit four outright in real code, but 3.5 is a realistic target for well-optimised code. We can use one instruction to detect newline characters within those 32 bytes, then a few more to find the exact location, then a couple to determine whether the line is a result, and a few more to extract that result. Given a high density of newlines, this means something on the order of 10 instructions per 32-byte block searched. Multiply the numbers and we expect to process approximately 11 bytes per clock cycle, or roughly 55 GB/s at 5 GHz, which would mean we would expect to be done in 32 ms, give or take. And the data would of course need to be in memory already for this time to be feasible, as loading it from disk takes appreciably longer.
Of course you have to spend some effort to actually get code this fast, and that probably isn't worth it for the one-shot job. But jobs like compression, video codecs, cryptography and that newfangled AI stuff all have experts that write code in this manner, for generally good reasons, and they can all ballpark how a job like this can be solved in a close to optimal fashion.
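To make the instruction budget above concrete, here is a minimal sketch (mine, not the parent's code, and assuming an x86_64 machine with AVX2) of just the newline-detection step: one load, one compare, and one movemask per 32-byte block, with the "exact location" step being a trailing_zeros on the resulting mask.

    use std::arch::x86_64::*;

    // Return a 32-bit mask with one bit set per newline in the 32-byte block.
    #[target_feature(enable = "avx2")]
    unsafe fn newline_mask(block: &[u8; 32]) -> u32 {
        // Load the 32-byte block into a 256-bit register (1 instruction).
        let bytes = _mm256_loadu_si256(block.as_ptr() as *const __m256i);
        // Compare all 32 bytes against '\n' at once (1 instruction).
        let hits = _mm256_cmpeq_epi8(bytes, _mm256_set1_epi8(b'\n' as i8));
        // Collapse the comparison into a bit mask (1 instruction).
        _mm256_movemask_epi8(hits) as u32
    }

    fn main() {
        // 32 bytes of made-up input containing two complete lines.
        let block: [u8; 32] = *b"result=42\nresult=7\nxxxxxxxxxxxx\n";
        if is_x86_feature_detected!("avx2") {
            let mask = unsafe { newline_mask(&block) };
            // "A few more instructions to find the exact location":
            // trailing_zeros gives the byte offset of the first newline.
            println!("first newline at byte {}", mask.trailing_zeros());
        }
    }

Production code in this style (memchr, simdjson, and friends) adds runtime feature detection and handling for the unaligned tail of the buffer, but the per-block budget is still the handful of instructions being counted above.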
The way I see it is that we're in an era analogous to what came immediately after alchemy. We're all busy building up phlogiston-like theories that will disprove themselves in a decade or two.
But this is better than where we just came from. Not that long ago, you would build software by getting a bunch of wizards together in a basement and hoping they produce something that you can sell.
If things feel worse (I hope) that's because the rest of us muggles aren't as good as the wizards that came before us. But at least we're working in a somewhat tractable fashion.
The mathematical frameworks for construction were first laid out in the ~1500s (iirc), and people had been building things since time immemorial. The mathematics for computation started in the 1920s-30s. And there's currently no mathematics for the comprehensibility of blocks of code. [Sure, there's cyclomatic complexity and Weyuker's 9 properties, but I've got zero confidence in either of them. For example, neither of them accounts for variable names, so a program with well-named variables is just as 'comprehensible' as a program with names composed of 500MB of random characters. Similarly, some studies indicate that cyclomatic complexity has worse predictive power for the presence of defects than plain lines of code. And from what I've seen of Weyuker's work, they haven't shown any reason to assume their output is predictive of anything useful.]
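A toy example of the variable-name blind spot (names invented by me): both functions below have identical control flow, so cyclomatic complexity gives them the same score of 3, yet only one of them is easy to comprehend.

    // Same structure, same cyclomatic complexity (two decisions, three paths).
    fn clamp_to_range(value: i32, low: i32, high: i32) -> i32 {
        if value < low {
            low
        } else if value > high {
            high
        } else {
            value
        }
    }

    // Behaviourally identical, structurally identical, far harder to read.
    fn q7(zz: i32, a: i32, b: i32) -> i32 {
        if zz < a {
            a
        } else if zz > b {
            b
        } else {
            zz
        }
    }

    fn main() {
        assert_eq!(clamp_to_range(15, 0, 10), q7(15, 0, 10));
        println!("both clamp 15 into [0, 10]: {}", clamp_to_range(15, 0, 10));
    }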
It can be, but usually isn't. Similarly, dropping a feather and a bowling ball simultaneously might be science, or might not be. Did I make observations? Or am I just delivering some things to my friend at the bottom?