> Ignoring outright bad code, in a world where functional code is so abundant that “good” and “bad” are indistinguishable, ultimately, what makes functional AI code slop or non-slop?
I'm sorry, but this is an indicator for me that the author hasn't had a critical eye for quality in some time. There is massive overlap between "bad" and "functional." More than ever. The barrier-to-entry to programming got irresponsibly low for a time there, and it's going to get worse. The toolchains are not in a good way. Windows and macOS are degrading both in performance and usability, LLVM still takes 90% of a compiler's CPU time in unoptimized builds, Notepad has AI (and crashes,) simple social (mobile) apps are >300 MB download/installs when eight years ago they were hovering around a tenth of that, a site like Reddit only works on hardware which is only "cheap" in the top 3 GDP nations in the world... The list goes on. Whatever we're doing, it is not scaling.
I'd think there'll be an initial dip in code quality (compared to human-written code) due to the immaturity of the "AI machinery". But over time, on a mass scale, we are going to see an improvement in the quality of software artifacts.
It is easier to 'discipline' the top 5 AI agents on the planet than to try to get a million distributed devs ("artisans") to produce high quality results.
It's like the clothing or manufacturing industry, I think. Artisans were able to produce better individual results than the average industry machinery, at least initially. But over time, industry machinery could match the average artisan or even beat the average, while decisively beating it in scale, speed, energy efficiency, and so on.
The issue is that code isn't clothing. It's the clothing factory. We aren't artisans sewing clothing. We're production engineers deciding on layouts for robots to make clothes most efficiently.
I see this type error in thinking all the time. Engineers don't make objects of type A; we make functions of type A -> B or higher order.
Go concrete. In FAANG engineering jobs today, what % is in this factory-designer category versus what % is writing mundane glue code, moving data around in CRUD calls, putting in a monitoring metric, etc.?
Once you look at how present engineering orgs are actually composed, you'll see the error in that thinking.
There are other analogy issues in your response which I won't nitpick
> industry machinery could match the average artisan or even beat the average
Whether it could is distinct from whether it will. I'm sure you've noticed the decline in the quality of clothing. Markets are mercurial and subject to manipulation through hype (fast fashion is just a marketing scheme to generate revenue, but people bought into the lie).
With code, you have a complicating factor, namely, that LLMs are now consuming their own shit. As LLM use increases, the percentage of code that is generated vs. written by people will increase. That risks creating an echo chamber of sorts.
I don't agree with the limited point about fast fashion/enshittification, etc.
Quick check: Do you want to go back to pre-industrial era then - when according to you, you had better options for clothing?
Personally, I wouldn't want that - because I believe as a customer, I am better served now (cost/benefit wise) than then.
As to the point about recursive quality decline - I don't take it seriously, I believe in human ingenuity, and believe humans will overcome these obstacles and over time deliver higher quality results at bigger scale/lower costs/faster time cycles.
> Quick check: Do you want to go back to pre-industrial era then - when according to you, you had better options for clothing?
This does not follow. Fast fashion as described is historically recent. As an example, I have a cheap t-shirt from the mid-90s that is in excellent condition after three decades of use. Now, I buy a t-shirt in the same price range, and it begins to fall apart in less than a year. This decline in the quality of clothing is well known and documented, and it is incredibly wasteful.
The point is that this development is the product of consumerist cultural presuppositions that construct a particular valuation that encourages such behavior, especially one that fetishizes novelty for its own sake. In the absence of such a valuation, industry would take a different direction and behave differently. Companies, of course, promote fast fashion, because it means higher sales.
Things are not guaranteed to become better. This is the fallacy of progress, the notion that the state of the world at t+1 must be better than it was at t. At the very least, it demands an account of what constitutes "better".
> I don't take it seriously, I believe in human ingenuity, and believe humans will overcome these obstacles
That's great, but that's not an argument, only a sentiment.
I also didn't say we'll necessarily experience a decline, only that LLMs are currently trained on data produced by human beings. That means the substance and content is entirely derived from patterns produced by us; hence the appearance of intelligence in the results they produce. LLMs merely operate over statistical distributions in that data. If LLMs reduce the amount of content made by human beings, then training on the generated data is circular. "Ingenuity" cannot squeeze blood out of a stone. Something cannot come from nothing. I didn't say there can't be this something, but there does need to be a something from which an LLM or whatever can benefit.
> It is easier to 'discipline' the top 5 AI agents on the planet than to try to get a million distributed devs ("artisans") to produce high quality results.
Your take is essentially "let's live in shoe boxes; packaging pipelines produce them cheaply en masse, so who needs slowpoke construction engineers and architects anymore?"
Where have I said engineers/architects aren't necessary? My point is that it is easier to get AI to get better than try to improve a million developers. Isn't that a straightforward point?
What the role of an engineer will be in this new context, I am not speculating on.
> My point is that it is easier to get AI to get better than try to improve a million developers.
No, it's not; your whole premise is invalid, both in terms of financing the effort and in terms of the AI's ability to improve beyond RNG+parroting. The AI code agents produce shoe boxes; your claim is that they can be improved to produce buildings instead. It won't happen, not until you get rid of the "temperature" (newspeak for RNG) and replace it with conceptual cognition.
Except I am not talking about clothing. You are guessing when you say "I'd think" based on your comparison to manufacturing clothing. Why guess and compare when you have more context than that? You're in this industry, right? The commodity of clothing is not like the commodity of software at all. Almost nothing is, as it doesn't really have a physical form. That impacts the economics significantly.
To highlight the gaps in your analogy: machinery still fails to match artisan clothing-makers. Despite being relatively fit, I've got wide hips. I cannot buy denim jeans that fit both my legs _and_ my waist. I either roll the legs up or have them hemmed. I am not all that odd, either. One size cannot fit all.
Artisanal clothing is functionally equivalent to mass-produced clothing, but more expensive.
Much of contemporary software is functionally equivalent but more expensive to run and produce than previous generations. Chat, project management, document editing, online stores… all seem to have gotten more expensive to produce and run with little to no gain in functionality.
Complexity in software production and tooling keeps increasing, yet functionally software is more or less the same as 20 years ago (obviously excluding advancements that depend on hardware, like video, 3D rendering, LLMs, etc.).
One issue is that tooling and internals are currently optimized for individual people's tastes. Heterogeneous environments make the models spikier. As we shift to building more homogenized systems optimized around agent accessibility, I think we'll see significant improvements.
Elegantly, agents finally give us an objective measure of what "good" code is. It's code that maximizes the likelihood that future agents will be able to successfully solve problems in this codebase. If code is "bad" it makes future problems harder.
> Elegantly, agents finally give us an objective measure of what "good" code is. It's code that maximizes the likelihood that future agents will be able to successfully solve problems in this codebase. If code is "bad" it makes future problems harder.
An analogous argument was made in the 90's to advocate for the rising desire for IDEs and OOP languages. "Bad" code came to be seen as 1000+ lines in one file because you could simply conjure up the documentation out-of-context, and so separation of concerns slipped all the way from "one function one purpose" to something not far from "one function one file."
I don't say this as pure refusal, but to raise the question of what we lose when we make these value changes. At this time, we do not know. We are meekly accepting a new mental prosthesis with insufficient foresight of the consequences.
I don't disagree that shit is gonna get weird and not all the changes are good.
Ultimately I think we need to move away from the concept of codebases as they currently exist, towards "databases" of functionality that get composed as needed. Agents are going to make mixing and matching for bespoke purposes such a central paradigm that large monolithic packages don't make as much sense as they used to, just like monolithic apps make less sense in a world where agents programmatically call tools to do most work on computers.
Through my own experience moving from IDE & higher-level languages to a simple textual editor and (nullable) systems languages, I've noticed that the way I read and write code is entirely different, and I can remember my "old eye." I think most people reading this view your "noise" as their signal. They get a good feeling when resolving tooling diagnostics (doubled if in an unfamiliar-to-them domain like systems programming.) It makes them feel secure. In that sense, I agree very much with the article: "I honestly think a lot of this discussion is fundamentally a misunderstanding of different perspectives rather than anything technical."
Personally, I've noticed that I pause less and write more now that I've got less tooling in my editor, and that's very enjoyable. I'd encourage anyone to try it. I have no autocomplete or "hover" info, just compilation errors presented aside the offending lines.
The business is this: Tailwind is free. Everyone uses it. People visit their docs and eventually buy some of the things they actually sell (like books, support, etc).
With LLMs, almost nobody visits their docs anymore, just as folks barely visit Stack Overflow anymore (SO's traffic is down more than 80%). Fewer people see the things they may want to buy from team Tailwind, so they make less money, so they implode. Plus, LLMs directly compete with their support offering.
Doesn't make much sense to me. It's literally a conversion of CSS rules to classes. Bootstrap already had a few of these as utility classes. I know it does a bit of magic in the background.
They made money off selling preset components and documentation etc, but as others have said, AI has pretty much ripped this off.
It's one of those things trying to monetise something out of nothing because it became popular.
They had over $2M in revenue in 2024... then AI happened and that likely dried up. They staffed up during the boom time and are now rightsizing based on the changed landscape.
> I suspect that part of the appeal is that null pointers are low-hanging fruit. They're easy to point out and relatively easy to "solve" in the type system, which can make the win feel larger than it actually is.
I agree. I find Options more desirable for API design than for making function bodies easier to understand/maintain. I'm kind of surprised I don't use Maybe(T) more frequently in Odin for that reason. Perhaps it's something to do with my code scale or design goals (I'm at around 20k lines of source,) but I'm finding multiple returns are just as good, if not better, than Maybe(T) in Odin... it's also a nice bonus to easily use or_return, or_break, or_continue, etc., though at this point I wouldn't be surprised if Maybe(T) were compatible with those constructs. I haven't tried it.
To make `Maybe(T)` feel like a multiple return, you just need to do `.?`, so `maybe_foo.? or_return` would work just the same.
But as you say, it is more common to do the multiple-return-value thing in Odin and not need `Maybe` for return values at all. Maybe is more common for input parameters or annotating foreign code, as you have probably noticed.
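As a rough sketch (procedure names made up for illustration, not from any real codebase), the two styles line up like this; the `.?` form composes with `or_return` exactly as described:

```odin
package main

import "core:fmt"

// Multiple-return style: the usual "optional-ok" shape.
find_name :: proc(id: int) -> (name: string, ok: bool) {
	if id == 42 {
		return "zaphod", true
	}
	return
}

// Maybe style: the same information carried in the return type.
find_name_maybe :: proc(id: int) -> Maybe(string) {
	if id == 42 {
		return "zaphod"
	}
	return nil
}

greet :: proc(id: int) -> (msg: string, ok: bool) {
	a := find_name(id) or_return         // ok value checked and propagated
	b := find_name_maybe(id).? or_return // .? turns Maybe(T) into (T, bool)
	return fmt.tprintf("hello %s / %s", a, b), true
}

main :: proc() {
	if msg, ok := greet(42); ok {
		fmt.println(msg)
	}
}
```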
My feelings have evolved so much on Option types... When Swift came around I'm sure I would've opposed the argument in this article. For two reasons.
1. I was a mobile dev, and I operated at the framework-level with UIKit and later SwiftUI. So much of my team's code really was book-keeping pointers (references) into other systems.
2. I was splitting my time with some tech-stacks I had less confidence in, and they happened to omit Option types.
Since then I've worked with Dart (before and after null safety,) C, C++, Rust, Go, TypeScript, Python (with and without type hints,) and Odin. I have a hard time not seeing all of this as preference, but one where you really can't mix them to great effect. Swift was my introduction to Options, and there's so much support in the language syntax to help combat the very real added friction, but that syntax support can become a sort of friction as well. To see `!` at the end of an expression (or `try!`) is a bit distressing, even when you know today the unlikelihood (or impossibility) of that expression yielding `nil`.
I have come to really appreciate systems without this stuff. When I'm writing my types in Odin (and others which "lack" Optionals) I focus on the data. When I'm writing types in languages which borrow more from ML, I see types in a few ways; as containers with valid/invalid states, inseparably paired with initializers that operate on their machinery together. My mental model for a more featureful type-system takes more energy to produce working code. That can be a fine thing, but right now I'm enjoying the low-friction path which Odin presents, where the data is dumb and I get right to writing procedures.
Yes it's the burden of proof. That's why writing Rust is harder than C++. Or why Python is easier than anything else. As a user and customer, I'd rather pay more for reliable software though.
Odin offers a Maybe(T) type which might satisfy your itch. It's sort of a compromise. Odin uses multiple-returns with a boolean "ok" value for binary failure-detection. There is actually quite a lot of syntax support for these "optional-ok" situations in Odin, and that's plenty for me. I appreciate the simplicity of handling these things as plain values. I see an argument for moving some of this into the type-system (using Maybe) when it comes to package/API boundaries, but in practice I haven't chosen to use it in Odin.
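For instance, here's a toy sketch (made-up names) of how the same optional-ok shape shows up across map indexing, `or_else`, and union type assertions:

```odin
package main

import "core:fmt"

main :: proc() {
	scores := map[string]int{"ada" = 3}
	defer delete(scores)

	// Map indexing is an optional-ok expression.
	if score, ok := scores["ada"]; ok {
		fmt.println("ada:", score)
	}

	// or_else supplies a default when the ok value would be false.
	bob := scores["bob"] or_else 0
	fmt.println("bob:", bob)

	// Type assertions on unions use the same two-value pattern.
	Value :: union {int, string}
	v: Value = 7
	if n, ok := v.(int); ok {
		fmt.println("n =", n)
	}
}
```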
Maybe(T) would be for my own internal code. I would need to wrap/unwrap Maybe at all interfaces with external code.
In my view, a huge value-add going from plain C to Zig/Rust has been eliminating the NULL-pointer possibility from the default pointer type. Odin makes the same mistake as Golang did. It's not excusable, IMHO, in such a new language.
Both Odin and Go have the "zero is default" choice. Every type must have a default and that's what zero signifies for that type. In practice some types shouldn't have such a default, so in these languages that zero state becomes a sentinel value - a value notionally of this type but in fact invalid, just like Hoare's NULL pointer, which means anywhere you didn't check for it, you mustn't assume you have a valid value of that type. Sometimes it is named "null" but even if not it's the same problem.
Even ignoring the practical consequences, this means the programmer probably doesn't understand what their code does, because there are unstated assumptions all over the codebase because their type system doesn't do a good job of writing down what was meant. Almost might as well use B (which doesn't have types).
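A minimal sketch of what that zero-as-sentinel looks like in practice (hypothetical Odin type, invented purely for illustration):

```odin
package main

import "core:fmt"

// The pointer field gets a zero value (nil) whether or not "no best friend"
// is a state we ever meant to represent.
User :: struct {
	name:        string,
	best_friend: ^User,
}

main :: proc() {
	u := User{name = "ada"} // best_friend is implicitly nil
	// fmt.println(u.best_friend.name) // compiles fine; crashes at runtime
	if u.best_friend != nil {
		fmt.println(u.best_friend.name)
	} else {
		fmt.println(u.name, "has no best friend recorded")
	}
}
```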
This gets you dynamic dispatch, roughly via the C++ route (inline vtables in the implementing types). This means you must always pay for this on the types which provide it, even if you rarely use the feature; removing those vtables makes it unavailable everywhere.
A lot of programmers these days want the method-call ergonomics of static dispatch, and Odin doesn't help you there. Odin thinks we should suck it up and write alligator_lay_egg_on(gator, egg, location), not gator.lay_egg_on(egg, location).
If we decide we'd prefer to type gator->lay_egg_on(egg, location), then Odin charges us for a vtable in our Alligator type, which we didn't need or want, and then we incur a stall every time we call it because we have to go through the vtable.
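To make that concrete, here's a rough Odin sketch (type and procedure names invented for illustration). `g->f(args)` is sugar for `g.f(g, args)`, i.e. an indirect call through a procedure pointer stored in the value itself:

```odin
package main

import "core:fmt"

// The "vtable" here is just a procedure pointer carried by every Alligator value.
Alligator :: struct {
	name:       string,
	lay_egg_on: proc(gator: ^Alligator, egg: int, location: string),
}

alligator_lay_egg_on :: proc(gator: ^Alligator, egg: int, location: string) {
	fmt.printf("%s lays egg %d on the %s\n", gator.name, egg, location)
}

main :: proc() {
	gator := Alligator{name = "Albert", lay_egg_on = alligator_lay_egg_on}

	// Plain procedure call: direct (static) dispatch, no pointer needed in the type.
	alligator_lay_egg_on(&gator, 1, "riverbank")

	// Arrow call: g->lay_egg_on(2, "riverbank") desugars to
	// g.lay_egg_on(g, 2, "riverbank") -- the indirect, per-value cost described above.
	g := &gator
	g->lay_egg_on(2, "riverbank")
}
```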
Oh, nice. I have to admit I'm not all that familiar with Odin, because I've been all-in on Zig for a long time. I've been meaning to try out a game dev project in Odin for a while though, but haven't had the time.