I don't think the criticism is outdated, in spite of what you mention. Indeed, in the last few decades, releasing the code has gone from uncommon to standard practice. But in my view, both you and the post author are greatly overstating the importance of releasing the code. Sure, releasing it is clearly better than not releasing it; it's a step forward. But it makes very little difference to peer review, if any, since reviewers hardly ever actually open the code. You're already lucky if you can find reviewers who read the whole paper rather than skimming the dense central sections - good luck finding reviewers who will actually check the code, considering that it takes much more time and effort than reading the paper. When I'm a reviewer myself, I also penalize papers with no code - but papers with code might as well link the Minesweeper source and most of the time I wouldn't notice, because I don't have time to check - and I devote more time per review than most people I know.
Regarding your second paragraph, that is what peer review actually is at present, but the point is that it's supposed to be much more than that. And the problem is not only outright fraud that might have severe consequences; the post itself mentions "soft fraud" that will hardly ever have any consequence (who is going to notice that someone ran several experiments and only published the one favoring their conclusion?). And there are also problems not related to fraud - for example, peer review is supposed to be not only about correctness but also about choosing the best work for top-tier venues, good but less impactful work for second tier, etc. Shallow peer review often means that work that is not fraudulent, but not stellar, gets accepted at better venues because of famous authors/institutions, or authors who are especially skilled at "pitching" their work - and vice versa.
In practice, fraud and file-drawer effects (as estimated from the many large-N replication projects that have sprung up) are much less common than the replication-crisis headlines imply; a lot of it seems to simply be that a "null hypothesis" is better understood as a continuous distribution, because the null model doesn't match reality perfectly. See "The replication crisis is not a crisis of false positives", Coleman et al. 2024, https://osf.io/preprints/socarxiv/rkyf7
In my experience, getting someone else's code from a paper to run is a PITA. In life science, the person who wrote the code is often not really that experienced in this kind of thing, and certainly didn't have ease of sharing and running in mind when they started the project.
I've long had in the back of my mind some kind of cloud-based platform for writing, running, storing, and sharing scientific code. It would be the standard, and journals would (in time) ask that analysis be done using it, or that reasons are provided as to why it could not be used.
> I've long had in the back of my mind some kind of cloud-based platform for writing, running, storing, and sharing scientific code.
And now there are 15 competing standards.
Everyone uses the computing resources available at their institution, and each institution makes its own choices. Running someone else's code almost always involves migrating it to a new environment. Even using something as common as Docker is problematic, as it doesn't always interact nicely with the rest of the environment (such as Slurm).
I just don't really think peer review is that important... it doesn't need to be more thorough, and you don't need people to double-check every little thing or try to replicate things before publication. Good scientists are already skeptical, and will remain so about publications no matter how much peer review they go through.
In the past, it mostly helped journals weed out papers that weren't interesting, to save print costs. Nowadays, as a scientist, I'd rather everything just get posted quickly on preprint servers, and make my own judgement. Reviewers won't ever look deeply, but you can bet another scientist planning several years of work based on someone else's paper will look really deeply.
IMO, when I've had papers rejected because of bad reviews, the reviews usually came from competitors who were bitter about me beating them to something important or using a different approach than they would... or from inappropriate reviewers who weren't qualified to understand the paper. I don't think more aggressive gatekeeping by peers would do anyone any good; it should probably just be eliminated entirely.
Having the code is important because the code is part of the methods. Without the code, the work isn't always even possible to reproduce.
In a perfect world, studies would not only be required to include the code; any of the following would also be grounds for rejecting a publication:
* The code can't be run outside the original researcher's undocumented special snowflake system
* A fellow scientist who understands the domain can't understand what the code is doing and so is unable to verify that it isn't making any mistakes
* It'd be really nice if there were some sort of test cases verifying that the code produces results in line with what that domain considers established science, before it is applied to the experimental data - probably not always possible, but it'd be good to at least try (see the sketch after this list)
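For the last point, here's a minimal pytest-style sketch of what such a test could look like, with a made-up estimate_mean() standing in for the paper's actual analysis and synthetic data standing in for the established result:

```python
# test_sanity.py - hypothetical example of a sanity check: before touching the
# experimental data, confirm the analysis code reproduces a value the field
# already agrees on (here a toy analytic stand-in).
import numpy as np


def estimate_mean(samples: np.ndarray) -> float:
    """Stand-in for the paper's actual analysis routine."""
    return float(np.mean(samples))


def test_recovers_known_value():
    # Synthetic data drawn from a distribution whose mean is known exactly,
    # playing the role of "established science" for this toy example.
    rng = np.random.default_rng(42)
    samples = rng.normal(loc=3.0, scale=1.0, size=100_000)
    assert abs(estimate_mean(samples) - 3.0) < 0.02
```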
Yeah, I know this doesn't touch the dozens of other reasons the whole institution is in trouble, and that for those reasons usually nobody with the skill and time to properly review the code actually reviews it carefully. For that matter, most scientists aren't very good at writing code that's easy to understand and will run everywhere. Just one more set of things for the list of what really should be changed.
Most scientists unfortunately lack the technical skill to make their code easy to run elsewhere... however I think it is getting better.
Someday I would like to see a requirement that the code run anywhere with a single command, e.g. 'docker compose up' or such, and then automatically produce all of the figures and analysis in the paper. There have been some pushes to start journals where the articles themselves can't contain any plots or numbers directly, but must generate them automatically from code on the journal's servers, e.g. with something like R Markdown. It hasn't caught on because the work and technical skill required are too much.
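To make the single-command idea concrete, here's a rough sketch of the kind of entry point I mean (the file names, CSV layout, and dose/response example are all made up for illustration):

```python
# make_figures.py - hypothetical single entry point that regenerates every
# figure in the paper from the raw data; `docker compose up` would do nothing
# but run this script inside the container.
from pathlib import Path

import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend: no display needed inside a container
import matplotlib.pyplot as plt

DATA = Path("data/measurements.csv")  # hypothetical raw data shipped with the paper
OUT = Path("figures")


def main() -> None:
    OUT.mkdir(exist_ok=True)
    # Assumed CSV layout: column 0 = dose, column 1 = response, one header row.
    dose, response = np.loadtxt(DATA, delimiter=",", skiprows=1, unpack=True)

    # Figure 1: raw scatter plus a least-squares fit, written to a fixed path
    # so the manuscript can reference figures/fig1.pdf deterministically.
    slope, intercept = np.polyfit(dose, response, deg=1)
    fig, ax = plt.subplots()
    ax.scatter(dose, response, label="measurements")
    ax.plot(dose, slope * dose + intercept, label=f"fit: {slope:.2f}x + {intercept:.2f}")
    ax.set_xlabel("dose")
    ax.set_ylabel("response")
    ax.legend()
    fig.savefig(OUT / "fig1.pdf")


if __name__ == "__main__":
    main()
```

The compose file would just mount the data directory and run this script, so there's never any guessing about which script produces which figure.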
That's not the code from my point of view. That's the script that processes the output of the code. The actual code often needs nontrivial time and/or money to run, and you may need to spend some effort to match the resources you have with the resources the code needs.
The vast majority of papers, even computational ones, can be run that way. If there are compute-intensive steps, the same process can be followed, with a step that ships the previous output so that it only gets regenerated if the user manually deletes it. I usually do this with Makefiles.
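As a sketch of that caching idea, here's the same pattern in Python rather than a Makefile (simulate() and the output path are hypothetical stand-ins for the expensive step):

```python
# cache_step.py - regenerate an expensive result only if its cached output is
# missing, mirroring how a Makefile skips an up-to-date target.
import json
from pathlib import Path

import numpy as np

CACHE = Path("results/simulation.json")  # shipped alongside the paper's code


def simulate() -> dict:
    """Stand-in for the compute-intensive step (days of cluster time in reality)."""
    rng = np.random.default_rng(0)
    return {"mean": float(rng.normal(size=1_000_000).mean())}


def load_or_run() -> dict:
    if CACHE.exists():
        # Reuse the shipped output; deleting the file forces a full rerun,
        # exactly like deleting a Make target.
        return json.loads(CACHE.read_text())
    CACHE.parent.mkdir(parents=True, exist_ok=True)
    result = simulate()
    CACHE.write_text(json.dumps(result))
    return result


if __name__ == "__main__":
    print(load_or_run())
```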