I have a hunch we'll eventually swing back when we find the limits of vibe coding -- LLMs, too, can only hold so much complexity in their heads, even if it's an order of magnitude (or more) greater than ours. If we make code understandable for humans, it'll be trivial for LLMs, which frees them up for other things. They don't have infinite layers or units to capture concepts. So the more symmetrical, consistent, and fractal (composable) you can make your code, the easier time an LLM will have solving problems in it.
An LLM's context window limit already hits you in the nose when you have a big codebase and ask it questions that make it read a lot of code. 200k is easy to hit sometimes, especially when you only truly get to use about 120k of it.
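A back-of-envelope sketch of how quickly code eats a context window. The ~4 characters-per-token ratio and the 40 chars-per-line average are rough rules of thumb, not exact figures:

```python
# Rough estimate: how many source files fit in a usable context budget.
# Assumes ~4 characters per token, a common rule of thumb for English and code.
CHARS_PER_TOKEN = 4

def approx_tokens(num_lines, avg_chars_per_line=40):
    """Estimate the token count of a source file."""
    return num_lines * avg_chars_per_line // CHARS_PER_TOKEN

file_tokens = approx_tokens(500)    # one 500-line file
print(file_tokens)                  # ~5000 tokens
print(120_000 // file_tokens)       # ~24 such files fill a 120k budget
```

Two dozen medium files, before any conversation history or tool output, and the budget is gone.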
If you're reading past the first sentence this time: yes, it is obvious. So why use such language to describe the software? The deliberate choice to use misleading language is not just obviously incorrect -- it's harmful.
The solution offered is pretty weak. I don't think it addresses why the internet took the shape that it did. Publishing without centralized services is too much work for people. And even if you publish, it's not the whole solution. People want distribution with their publication. Centralized services offer ease of publication and ease of distribution. So unless the decentralized internet can offer a better solution to both, this story will play out again and again.
> For instance, I know Project A -- these are the concerns of Project A. I know Project B -- these are the concerns of Project B. I have the insight to design these projects so they compose, so I don't have to keep track of a hundred parallel issues in a mono Project C. On each of those projects, run a single agent -- with review gates for 2-3 independent agents (fresh context, different models! Codex and Gemini). Use a loop, let the agents go back and forth.
Can you talk more about the structure of your workflow and how you evolved it to be that?
I've tried most of the agentic "let it rip" tools. I quickly realized that GPT-5 was significantly better at reasoning, and more exhaustive, than Claude Code running Opus (RL-finetuned for Claude Code).
"What if Opus wrote the code and GPT-5 reviewed it?" I started evaluating this question and began getting higher-quality results and better control of complexity.
I could also trust this process more than my previous one of trying to drive Opus, looking at the code myself, trying to drive Opus again, and so on. Codex was catching bugs I would not have caught in the same amount of time, including bugs in hard math -- so I developed a great degree of trust in its reasoning capabilities.
It's a Claude Code plugin -- it combines "don't let Claude stop until a condition is met" (the Stop hook) with a few CLI tools to induce (what the article calls) review gates: Claude will work indefinitely until the reviewer is satisfied.
In this case, the reviewer is a fresh Opus subagent which can invoke and discuss with Codex and Gemini.
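Mechanically, a review gate like this can be sketched with Claude Code's Stop hook: the hook exits with code 2 to block Claude from stopping, and whatever it writes to stderr is fed back as feedback; exit 0 lets Claude finish. This is a minimal sketch, not the plugin's actual implementation -- `run_review.sh` is a hypothetical stand-in for whatever reviewer CLI you wire in (e.g. something invoking Codex or Gemini):

```python
"""Sketch of a Claude Code Stop hook acting as a review gate.

Exit code 2 blocks Claude from stopping; stderr becomes its feedback.
REVIEW_CMD is a hypothetical reviewer wrapper, not part of Claude Code.
"""
import json
import subprocess
import sys

REVIEW_CMD = ["./run_review.sh"]  # hypothetical: wraps a Codex/Gemini review pass

def gate(event, review_ok):
    """Map the reviewer's verdict to the hook's (exit_code, stderr_feedback)."""
    if event.get("stop_hook_active"):
        return 0, ""  # we already blocked once this turn; avoid an infinite loop
    if review_ok:
        return 0, ""  # reviewer satisfied: let Claude stop
    return 2, "Reviewer not satisfied; address its comments and continue."

def main():
    event = json.load(sys.stdin)  # Claude Code sends the hook event as JSON on stdin
    review_ok = subprocess.run(REVIEW_CMD).returncode == 0
    code, feedback = gate(event, review_ok)
    if feedback:
        print(feedback, file=sys.stderr)  # with exit code 2, this goes back to Claude
    return code

# When installed as a Stop hook, end the script with:
#   if __name__ == "__main__":
#       sys.exit(main())
```

The `stop_hook_active` check matters: without it, a never-satisfied reviewer blocks the stop forever.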
One perspective I have that relates to this article: the thing you want to optimize is minimizing error per unit of work. If you have a dynamic-programming-style orchestration pattern for agents, you want the thing that solves the smallest unit of work (a task) to have as low an error rate as possible, or else I suspect the error compounds quickly in these stochastic systems.
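A back-of-envelope illustration of the compounding (the rates are made up; the point is the exponent):

```python
# If each subtask independently succeeds with probability p, a chain of
# n subtasks succeeds end-to-end with probability p**n.
def pipeline_success(p, n):
    return p ** n

print(round(pipeline_success(0.99, 50), 3))  # 0.605: 1% per-task error is survivable
print(round(pipeline_success(0.95, 50), 3))  # 0.077: 5% per-task error is fatal
```

A seemingly small difference in per-task reliability is the difference between a pipeline that usually works and one that almost never does.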
I'm trying this stuff for fairly advanced work (in a PhD), so I'm dogfooding ideas (like the ones presented in this article) in complex settings. I think there is still a lot of room to learn here.
I'm sure we're just working with the same tools thinking through the same ideas. Just curious if you've seen my newsletter/channel @enterprisevibecode https://www.enterprisevibecode.com/p/let-it-rip
He didn't need to finish it for it to have an impact. The makers of FilePilot and Animal Well both credit Handmade as a big inspiration for going the way they did. They said they got the most value from the first 50 or so episodes. You'll hear a lot of them on the Wookash podcast.
So for your opinion to carry any weight, please enlighten us as to the games you have shipped that qualify you to comment on their take on programming practices.
Not really. Let's reverse the situation on you: why should we take your opinion seriously? We have no idea how much you have shipped, if anything at all, so by your own logic, your ragging on other programmers' practices is ridiculous.
I've shipped a few things over the years, but I doubt I have strong takes on programming, besides 'the "properness" of a variable's name depends on the number of lines between its definition and its usage.' I doubt anyone will take my considerations seriously.
I'm not making any claims about programming practices
If someone comes out saying "you guys are all doing this wrong" and yet they can't finish their own project then why would I take their advice seriously?
If you suggest a way of doing software development and you can't even show it working out to completion, what does that say about your proposed methods?
I had a larger rant written, but this is the only part that had any value:
Yes, one can argue that a lack of produced results doesn't count for much in favor of their work processes, but it does not necessarily negate the value of the concepts they preach.
The value of a thing is not defined only by who is spouting it; one must evaluate the argument on its own merits, not by evaluating the people yelling about it.
There are plenty of concepts in this world that I cannot make work; that does not mean the concepts are bad. It only means the failure reflects on me and my own shortcomings.
And this might be something you are not noticing: you are indirectly making claims about programming practices by stating that THEIR practices are not worth considering due to their lack of shipping anything.
It's not really the same. Casey is suggesting that people who don't spend 10 years crafting everything from scratch are somehow "lesser than." The user you're replying to is pointing out that Casey has set a completely arbitrary bar for game quality that conveniently leaves out his own inability to ship something, and that's funny.
We're not saying games taking longer than a few years are failures, we're saying good games can encompass both approaches. But Casey, and his followers, are doing purity tests to feel good about themselves.
And this is assuming the games they ship are even good or noticeable to the user. I don't much care for Braid or The Witness, and I don't want my favorite dev studios to suddenly write everything from scratch every time. I would have a lot less fun things to play.
https://interjectedfuture.com