Some of us do actually have intimate knowledge in certain areas where guidance of an AI takes longer than doing it yourself. It's not about typing speed, it's that when you know something really really well the solution/code is already known to you or the very act of thinking about the problem makes the solution known to you in full. When that happens it's less text to write that solution than it is to write a sufficient description of the solution to AI (not even counting the back and forth required of reviewing the AI output and correcting it).
This is actually my biggest gripe with vibecoding. The single best feature of any programming language is that it is precise. And that is what we throw out?! In favor of natural language, of all things?! We're insane!
It turns out an awful lot of precision (plenty for many things) lives in library and web APIs, documentation, header files, and dependency manifests. Language can literally just point at it without repeating it all. Avoiding mistakes by eliminating manual copying of things like actuarial and ballistics tables was what the original computers were built for.
API glue is the easy and boring part of programming. Nobody really enjoys wiring API A to API B, combining the results, and using API C to push it forward.
Any semi-competent AI Agent can do that with a plan you've written in 5 minutes.
I would love to see an AI try to make sense of GTK API.
I may be wrong, but it seems when people talk about easy glue code, they're talking about web service APIs, not OS APIs, not graphics or sound APIs, not file format libraries, …
I used Sonnet 3.5 over a year ago to decrypt a notoriously shitty local government API to get data out of meetings, votes and discussions.
I know it's a piece of shit API done in the worst possible way on purpose (they don't want openness, but had to fulfill a law that mandates "openness") because I had previously tried to do it manually - twice. I ran out of whisky before I got anything done.
Sonnet _3.5_ almost one-shotted it with just the API "documentation" they had and access to Python and curl.
People have also hooked stuff into proprietary APIs on "smart" devices with zero documentation, just by having an Agent tirelessly run through thousands of permutations to figure it out.
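That permutation-probing approach is simple enough to sketch. A toy version (the endpoint, verbs, and parameter names here are all made up for illustration) is just a generator over candidate URLs that an agent would GET one by one:

```python
from itertools import product

# Hypothetical sketch of brute-forcing an undocumented device API.
# None of these endpoint/parameter names are real; an agent would
# request each candidate URL and keep the ones that return valid JSON
# instead of an error page.
def candidate_requests(base="https://device.local/api"):
    verbs = ["status", "state", "info"]
    keys = ["id", "device", "channel"]
    values = ["0", "1", "all"]
    for verb, key, val in product(verbs, keys, values):
        yield f"{base}/{verb}?{key}={val}"

probes = list(candidate_requests())  # 3 * 3 * 3 = 27 candidates
```

The loop itself is trivial; the tedious part, issuing thousands of these and reading the responses, is exactly what an agent does tirelessly.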
Historically we almost entirely moved from ASM to C, a language with lots of undefined behavior, because precision is not the most valued feature of languages.
It's the existence of UB that is the reduction in precision. A language without UB is more precise, in my view, than one that has UB. I don't know if this is a conventional view. But being able to write parsable, compiler-accepted code that does 'uh, whatever' feels like a reduction in precision to me, compared to a language that does not have that property.
Otherwise, we're just saying that the precise parts of the language are precise, which isn't much of a differentiator since it's similarly true for all languages.
UB is about edge cases that a compiler should not be forced to check for, and an occurrence is always a bug. You don't necessarily need a precise description of the actual faulty behavior.
Right. The language has well-formed expressions with no defined meaning in terms of machine instructions. My claim is that this is a reduction in precision compared to assembly language.
Grandparent said:
> The single best feature of any programming language is that it is precise.
C overtook a more precise language family because it has features other than precision that people cared about. Perhaps a better tradeoff of expressiveness and readability with precision.
Grandparent could be correct, and precision is the best feature of C, despite being less precise than ASM. And its better expressiveness nets out to a better overall programmer experience. I just wanted to point out that precision is something we do trade away for other things we want.
I don't completely follow the analogy, but I do follow the argument. High precision regarding the requirements often is not needed and that's exactly where LLMs shine.
That's also where engineers come into play. They (and often only they) can judge how much precision is needed depending on the part of the system they are working on.
Could you please explain why you feel that having UB makes C less precise than asm?
To me, the notion of precision isn't in any way related to whether any given statement is sound. It's about the behavior of the language for sound programs.
There are syntactically well-formed C programs that are not sound programs because their behavior is undefined. Or, rephrasing: a subset of all parseable C programs contain 'do whatever, I dunno'. I interpret this as a lack of precision.
One could take the position that specifying precisely 'do whatever, I dunno' counts as perfectly precise. But then a language that was entirely UB would count as precise, which would be an odd position to hold, since you can't specify any behavior at all with it.
Nobody seriously interprets "the C programming language" as "parseable C". Of course there's parseable, undefined C, and of course it's very imprecise. It's not relevant.
Now consider sound, in-spec C. Versus natural language.
Ok I’ll do the same move and show why it doesn’t persuade me.
Consider the subset of natural language that has strictly defined semantics. This would include, for example, talking about the arithmetic of real numbers. The rest is not relevant to evaluating the precision of natural language.
Does that exclusion feel different in the natural language case? Why?
Perhaps it’s a matter of degree, not a categorical difference.
The exceptions that prove the rule. When your programming language is built up of singular Unicode characters with specific meanings, of course that's faster than typing out in English what you want.
What do you use them for? For most AI users it's usually CRUD and I've never seen a web server or frontend in APL like languages.
The reason programming is hard is that most languages force you to use a hammer when you need a screwdriver. LLMs are very good at misusing hammers, and most people find them useful for that reason.
If you use a sane DSL instead, the natural-language description of a problem is always more complex and much longer than the equivalent description in the DSL. It's also usually wrong to boot.
I don't think you will find anyone who can do better than an LLM at one shotting the prose version of the problem. Both will of course be wrong.
But I also don't think you will find an LLM that can solve the problem faster than a human with Prolog when you have to use the prose description of the problem.
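A toy illustration of that gap (my own example, nothing as rich as Prolog): even for something trivial, the prose spec runs longer than the code in a fitting notation:

```python
from collections import Counter

text = "the quick brown fox jumps over the lazy dog the fox"

# Prose spec: "count how often each word occurs, then list the words
# from most frequent to least frequent" -- already longer than:
ranked = [word for word, _ in Counter(text.split()).most_common()]
```

And unlike the prose, the expression is unambiguous about ties, casing, and what counts as a word (whatever `str.split` yields).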
Using esoteric programming languages doesn’t suddenly make it true for the majority of development, which is web apps, CRUD stuff, some data science, etc.
Who is using APL and J these days? I guarantee 90+% of Claude users are developing CRUD web apps, or something similar. Your point about algebra is a non sequitur to what people are actually developing for these days.
The volume of people successfully adopting agentic engineering practices suggests this stuff isn't rocket science, but it is a learned skill and takes setup.
A year into heavy AI coding, my experience is that what you're describing should let you run 5+ agents simultaneously on a project, because you know what you're doing, you set it up right, and you know how to tell agents to leverage that properly.
More LOC committed per day is probably the only one that's guaranteed when you let spicy autocomplete take the wheel.
I don't think it's at all possible to reason about the other more meaningful metrics in software development, because we simply don't have the context of what each human is working on, and as with the WYSIWYG fad of 3 decades ago, "success" is generally self-reported, by people who don't know what they don't know, and thus they don't know what spicy autocomplete is getting woefully wrong.
"But it {compiles,runs,etc}" isn't a meaningful metric when a large portion of the code in question is dynamic/loosely typed in a non-compiled language (JavaScript, Python, Ruby, PHP, etc).
If you are on the right team with the right professionals, you can measure. When we first started using LLMs, we decided to run the same process as if they did not exist: same sprint planning meetings, same estimation. We did this for 6 months and saw roughly a 55% increase in output compared to pre-LLM usage. There are biases in what we tried to achieve; it is not easy to estimate that something will take XX hours when you know there is some portion (for example, writing documentation or parts of the test coverage) you won't have to write, but we did our best. After we convinced ourselves of the productivity gains, we stopped doing this. Saying you can't measure something is typical SWE BS, like "we can't estimate" and the other lies we were able to convince everyone of successfully.
Maybe you're the exception and are actually doing it right and actually getting good results, but every time I have heard this, it has been an ignorance-is-bliss scenario where the person saying it is generating massive amounts of code that they don't understand, not because they're incapable but because they don't care to, and immediately wiping their hands of it afterward.
To give an example of where I hear this, it is indistinguishable from the things I hear from my coworkers: "You just need the right setup!" (IMO the actual difference is I need to turn off the part of my brain that cares about what the code actually does or considers edge cases at all)
What I actually see, in practice, are constant bugs where nobody ever actually addresses the root cause, and instead just paves over it with a new Claude mass-edit that inevitably introduces another bug where we'll have to repeat the same process when we run into another production issue.
We end up making no actual progress, but boy do we close tickets, push PRs, and move fast and oh man do we break things. We're just doing it all in-place. But at least we're sucking ourselves off for how fast we're moving and how cutting edge we are, I guess.
I dunno, maybe I'm doing it wrong, maybe my team is all doing it wrong. But like I said the things they say are indistinguishable from the common HN comment that insists how this stuff is jet fuel for them, and I see the actual results, not just the volume of output, and there's no way we're occupying the same reality.
I've seen productivity surveys of senior programmers that show the reverse, and that matches our experience. A common finding is that gardening projects are a lot cheaper now that they're just a few extra terminal tabs running in parallel: security, refactoring, more testing, etc. Non-feature backlog items that senior developers value around tech debt are less of a discussion now. They're often essential now: to make AI coding work well, there is an effective automation poverty line around verification, testing, and specification that needs to be reached.
The understanding-code thing is tough. E.g., when a non-senior fullstack developer manually edits frontend CSS and didn't start from pixel-perfect designs across all form factors, do they really understand what they did? I wrote the first formal mechanized specification of the CSS standard, and would claim 95%+ of web developers do not understand core CSS layout rules to begin with: it was a struggle to semantically formalize even a tiny core of the box model as soon as you have floats. If the AI generates live storybooks and in-tool screenshots of all these things as part of the review process, and the code review "looks good", what's the difference?
I don't truly think this way - my point is to challenge basic claims of manual coding to be good to begin with and whether AI coding is being held to an artificial standard. What I see in commercial and defense software is a joke compared to what we do in the verification world. AI coding automating review iteration fixes in areas like security engineering and test coverage+amplification has been a blessing for quality improvement.
More fundamentally, we require developers by default to be responsible for knowing what the code does and having tested it. Every case of relaxing that rule has to be explicit, e.g., it must be clear that something is a prototype, or that an area is vibed with some alternate review/test flow, and we are learning as a team what that means in different situations. In practice, our senior AI coders are doing more quality engineering work than the manual coders, both per-PR and in broader gardening contributions.
I know you said you don't truly think that way, but to counter anyway since some people seem to legitimately hold this viewpoint:
I take issue with the implication that not necessarily having a full understanding of what the code/library/driver/compiler/abstraction is doing is somehow justification/permission to embrace and celebrate having basically no understanding of what any of the code is doing. The in-between space there is the vast majority of the surface area where nuance can and should exist.
>my point is to challenge basic claims of manual coding to be good to begin with and whether AI coding is being held to an artificial standard
That's fair, and I can only speak for myself here; I don't have any inherent philosophical issue with manual vs AI, but my personal experience is that AI coding is just straight-up a frustrating nightmare to deal with, IMO orders of magnitude worse than manual. It's faster, sure, but I end my rage-filled LLM debugging session walking away knowing I learned pretty much nothing and that there's no compounding knowledge or outcome that will keep me from experiencing the same thing tomorrow, and I hate that. I am Sisyphus rolling prompts into a terminal.
But I'm not gonna sit here and act like manual coding makes you morally virtuous or pure or whatever. IMO it's a great forcing function to better (even if not completely) understand what is going on in your system(s) and I think most everyone would agree with that. What's up for debate is probably whether that's worth the time tradeoff now that we have a magic time compressor machine available to us.
Maybe I only find that knowledge tradeoff valuable because I'm a lowly IC and not some super turbo chad 10x principal who built a distributed database in brainfuck 10 years ago for fun and has nothing left to learn, or a technical founder of 5 concurrent startups who is optimizing for business value. It's possible that a heavy bias for learning/skill acquisition blinds me here.
>we require developers by default to be responsible for knowing what the code does and having tested it. Every case of relaxing that rule has to be explicit
1. If what you're replying to was a thing, wouldn't there be an open source project where I could see this in action? Or some sort of example I could watch on YouTube somewhere. 2. The people that talk like this in my company spin up new projects all the time and then just get to hand them off to other teams to clean up the mess and decode what the heck is going on.
1. Probably most of https://github.com/simonw , but take care to separate adopted / semi-professional work from exploratory personal work.
2. That sounds like your company has a weak engineering culture and is early on its upskilling journey. We explicitly separate projects into prototypes vs production, where vibes are fine for the former, e.g., demos by designers / data scientists / sales engineers, but traditional code review standards apply to whatever is going into production. That mirrors my qualifier in #1.
I find that success here is a combination of engineering seniority, prompting experience, and domain experience. Anything lacking breaks the automation loop, like not knowing how and what to automate. E.g., all of our team finds value in AI coding, but junior engineers struggle on these dimensions, so they are not running the 3+ agents that senior ones are.
You seem to have missed OP's point: some things are only encoded in our brains when you are sufficiently experienced.
Translating that into code can happen directly by you, or into prompt iterations that need to result in the same/similar coded representation.
In other words, when it matters how something works and it is full of intricate details, you do not need to specify it, you just do it. (E.g., knowing how to avoid an N+1 query performance issue: you do not need a ticket or spec to be explicit about it, you can just do it at no extra effort. Models are probably OK at this one, as it is such a pervasive gotcha, but there are so many more.)
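To make the N+1 example concrete, here is a minimal sketch (toy schema, SQLite in-memory) of the shape an experienced developer avoids without needing it spelled out in any ticket:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors VALUES (1, 'Le Guin'), (2, 'Borges');
    INSERT INTO books VALUES (1, 1, 'The Dispossessed'), (2, 2, 'Ficciones');
""")

# The N+1 shape: one query for the authors, then one more per author.
authors = conn.execute("SELECT id, name FROM authors").fetchall()
n_plus_1 = {
    name: [t for (t,) in conn.execute(
        "SELECT title FROM books WHERE author_id = ?", (aid,))]
    for aid, name in authors
}

# What the experienced developer just writes instead: a single join,
# one round trip, regardless of how many authors there are.
joined = conn.execute(
    "SELECT a.name, b.title FROM authors a "
    "JOIN books b ON b.author_id = a.id"
).fetchall()
```

Both produce the same data; the first does 1 + N queries and falls over at scale, which is exactly the kind of thing that lives in an engineer's head rather than a spec.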
That's the failure to automate. The AI isn't telepathic, so agentic engineers not automating this stuff is skipping out on the engineering part.
You setup the environment and then you do the work. Unless you are switching employers every week, you invest in writing that stuff down so the generation is right-ish and generate validation tooling so it auto-detects the mistakes and self-repairs.
Sometimes you write the feature and write it well so it's reusable.
Imagine you have to implement a specific algorithm for a quantum computer.
There's no value in setting up AI to do the writing for you. That might be orders of magnitude harder than writing the algorithm directly.
For highly specialized one-off features, it doesn't always pay off.
On the other hand, if all you do are some generic items that AI can do well... then I'm not sure you're going to have a job long term, your prompts and automation will be useful for the new junior hires that will be specialized in using these and cost effective.
That feels true in theory, but in practice we see the reverse for advanced projects where AI is helping us a lot. A decent chunk of our core IP falls into the bucket you're describing:
We have been building a GPU-accelerated graph investigation platform that has grown over 10+ years with fancy stuff all over the place - think accelerated query languages, layout kernels, distribution, etc. R&D-grade high performance engineering projects and kernels end up needing a lot of iterations to make a prototype and initial release. Likewise, they're more devilish to maintain when they need a small tweak later because of the sophistication and bus factor. Both phases benefit.
AI coding helps automate investigation, testing, measurement, patching, etc. The immediate effect is we can squeeze in many more experimental iterations with more fidelity and reach. Having an AI help automatically explore the design space and the details helps a LOT. And later, maintaining a wide surface area of code here that is delicate to touch and infrequently edited is traditionally stressful for teammates, and AI editing + AI-generated automation is helping destress that a LOT. We very much invest in upgrading our team, processes, and tooling here.
I think there's a level above that where the words to describe such structure are familiar and readily available and hey guess what? The model understands those too. Just about every pattern has a name. Or a shape. Or an analog or metaphor in other languages or codebases. All work as descriptors.
This presumes that most of this stays encoded as words in our brains: the effort to translate some of these into words might be similar to translating it into code (still words, just very precise).
It's like talking legalese vs plain English; or formal logic vs English. Some people have the formal stuff come more naturally, and then spitting code out is not a burden.
No, it really doesn't presume anything about brains or information encoding. Just points out that there is a level of mastery in which all the techniques and all the forms have names or adequate descriptions. Teachers often attempt to achieve this, to facilitate education.
It's no accident there is an adage from Aristotle in the vein of: "Those who can, do. Those who understand, teach."
So yes, there is a level of mastery that is beyond being able to do a good job of designing and evolving complex systems which enables people to teach others the same skill set.
However, this is a smaller number of practitioners, and most have learned through practice and looking over how more experienced engineers apply their knowledge.
Where I disagree is with the idea that everybody is equally capable of teaching with words, or that there are no experts who are bad at teaching humans or directing AI; that such experts exist clearly indicates the knowledge is not encoded as words for them.
It's been pretty clear in my experience that experts tend to be capable of working with the same ideas in many different forms. That's what I would call mastery. It implies "complete" knowledge, which probably means several interrelated encodings with loci in different parts of the brain. Those interrelated encodings will be highly associated, and discerning in an expert. Which implies a high degree of usefulness and specificity in communication. This matches my experience.
Yes, there are still many areas where skilled humans are faster than AI (meaning faster coding yourself, than providing so much context and guidance that the AI can do it on its "own").
But in general the statement is really not true anymore; generic projects/problems have a pretty good chance that the AI can one-shot a working solution from a lazily typed, vague prompt.
Yeah it’s when you go off the happy path that it gets difficult. Like there’s a weird behaviour in your vibe-coded app that you don’t quite know how to describe succinctly and you end up in some back-and-forth.
But man AI is phenomenal for getting stuff out of your head and working quick.
That doesn't matter. The statement wasn't "faster than AI right now", it was "will always be faster than AI". And that's just nonsense.
Current AI systems are extremely serial, in that very little of the inherent parallelism of the problem is utilized. Current-gen AI systems run at most a few hundred thousand operations in parallel, while for frontier models, billions of operations could be run in parallel. In other words, what currently takes an AI 8 hours will take barely long enough for you to perceive the delay after you release the enter key.
For a demo, play around with https://chatjimmy.ai/ , the AI chatbot of Taalas, where they etched the model into silicon in a distributed way, instead of storing it in RAM and sucking it into the execution units through a straw. It's an 8B parameter model, so it's unsuitable for complex problems, but the techniques used for it will work for larger models too, and they are working to get there.
And even Taalas is very far from the limits. Modern better-quality LLM chatbots operate at ~40 tokens per second. The Taalas chatbot operates at 17000 tokens/s. If you took full advantage of parallelism, you should be able to have a latency of low hundreds of clock cycles per token, or single-request throughput of tens of millions of tokens per second. (With a fully pipelined model able to serve one token per clock cycle, from low hundreds of requests.)

Why doesn't everyone do it like that right now? Because to do this, you need to etch your model into silicon, which on a modern leading-edge process is a very involved job that runs to hundreds of millions+ in development and mask costs (we are not talking about single chips here; you can barely fit that 8B model into one), and will take around a year. So long as the models keep improving so much that a year-old model is considered too old to pay back the capital costs, the investment is not justified. But when it is done, it will not just make AI faster, it will also make it much more energy-efficient per token. Most of the energy costs are caused by moving data around and loading/storing it in memory.
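To spell out the single-request arithmetic (the clock rate here is my assumption for illustration; only the cycles-per-token figure comes from the estimate above):

```python
clock_hz = 2e9            # assumed 2 GHz silicon clock
cycles_per_token = 200    # "low hundreds of clock cycles per token"

# One request's tokens come out back-to-back, one per ~200 cycles:
single_request_tps = clock_hz / cycles_per_token   # 10 million tokens/s

# Fully pipelined, the chip emits one token per clock overall, shared
# across roughly cycles_per_token in-flight requests:
aggregate_tps = clock_hz
```

That is where "tens of millions of tokens per second" for a single request comes from, and why the aggregate is bounded by the clock rate rather than memory bandwidth.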
And I want to stress that none of the above is dependent on any kind of new developments or inventions. We know how to do it, it's held back only by the pace of model improvement and economics. When models reach a state of truly "good enough", it will happen. It feels perverse to me that people are treating this situation as "there was a per-AI period that worked like X, now we are in a post-AI period and we have figured out that it will work like Y". No. We are at the very bottom of a very steep curve, and everything will be very different when it's over.
Huh, I have to say that I am impressed with Chat Jimmy. No doubt the hardware running this model operates faster than any human. If this were possible to scale (and I'm not saying it isn't, I just don't think it's likely right now), LLMs have a real shot at replacing real-time graphics, frontend UIs, and all sorts of interactive media, if the market allows it.
I still think regardless of how fast a model outputs tokens, it still benefits the person responsible for that output to be well informed and knowledgeable about the abstractions they're piling on top of. If you have deep knowledge, you can operate faster than other people, and make those important decisions in a more intelligent manner than any model.
Maybe in the end we do get superintelligence and my point will finally break, but at that time I don't think I'll be worried about being wrong on the internet.
Ok sorry about that. I seriously don't believe him. The Agent is so fast there's literally no way you can be faster.
Telling the agent your high level plan that you are extremely familiar with and then having the agent execute on 2000 lines of code is FASTER than having you execute on those 2000 lines of code. There is no reality where that can be physically beaten, even by someone who's typing really quickly with zero pause. Physically impossible.
Less boring or not? Another way to put it... although my answer is boring, I think I'm right. He is either a liar, or like many other people he lacks skill in using AI. Because the transition to AI is happening so fast, not many people are fully utilizing AI to its maximum potential. Many still use IDEs, many still interact with the terminal. Many people still don't use it to configure infrastructure, do database administration, deploy code, etc.
Why are you starting the clock at the time when you already have a "high level plan that you are extremely familiar with"? I think it's fairer to start from "I received a bug report/feature request" or similar.
Also, haven't you ever had a situation where the prompt you started with ends up being longer than the final code diff? Perhaps a subtle bug that's hard to describe/trigger, but ended up having a simple root cause like an off-by-one error?
Also also, coding agents are infamous for generating way more code than is strictly necessary. The 2000 lines of code that the agent generated may well have been only 200 lines had you written it yourself.
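The off-by-one case is a good example of prose being the long way around: the bug report ("the last item of every page is missing") takes more words than the fix. A toy sketch:

```python
def paginate(items, page, size):
    start = page * size
    # Bug: the "- 1" silently drops the last item of every page.
    return items[start:start + size - 1]
    # Fix: return items[start:start + size]

first_page = paginate(list(range(10)), page=0, size=5)
# first_page is [0, 1, 2, 3] -- four items where five were expected
```

Describing the symptom precisely enough for an agent to find this can easily exceed the one-character diff.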
>Why are you starting the clock at the time when you already have a "high level plan that you are extremely familiar with"? I think it's fairer to start from "I received a bug report/feature request" or similar.
Done both. We tag the LLM on Slack in a reply, and the ticket gets created and forwarded to an agent that automatically works on it. The only time a human is in the loop is review, or queries for changes.
>Also, haven't you ever had a situation where the prompt you started with ends up being longer than the final code diff? Perhaps a subtle bug that's hard to describe/trigger, but ended up having a simple root cause like an off-by-one error?
Sometimes. Getting rarer and rarer.
>Also also, coding agents are infamous for generating way more code than is strictly necessary. The 2000 lines of code that the agent generated may well have been only 200 lines had you written it yourself.
Depends on the agent and it's random. This was mostly true probably 5 months ago. It's much less true now.
AI can write 2000 lines faster than you, but you can write the 2000 lines correctly on the first shot faster than having AI do 10 iterations on those 2000 lines with your guidance to finally get it right.
I know that a better plan could mean fewer iterations, but again, that extends the time you need to spend on the plan => the total time of the AI solution.
Right, but those 10 iterations only take up prompt-writing time. When the agent is executing, I move on to other tasks in parallel. AI is faster when you parallelize your workflow.
Prompt writing and parsing the AI output is still work you have to do. Not sure why you bring up parallelism, since you can't do other things while you're writing the prompts.
Other agents can be working while you're writing prompts.
Let me put it more explicitly: for one project I have 10 clones of the same repository on my local computer, each in its own folder. Each folder is responsible for working on a different ticket/feature. I prompt one folder, move on to the next. It takes practice to get used to this style.
Again it's not about typing speed. High level plans simply don't work very well, especially for big tasks where the optimal solution actually would take 2k lines. Unless you are building something that is extremely generic, AI coming up with the optimal solution rarely ever happens.
> He is either a liar, or like many other people he lacks skill in using AI
Not a liar, and I'm sorry to say, but AI really doesn't take much skill to use. People who say such statements give me the impression that their ceiling for skills is quite low.
There are areas where I do and will continue to use AI, and it works well enough. Having it generate prototypes for projects I don't have a lot of knowledge about is one thing. But I use those prototypes to learn.
> configure infrastructure
I make templates I can copy and tweak to do this faster than it takes to tell an agent what to do.
> database administration
Don't do that... Sure get it to write you some SQL to update a table, but don't give it DB admin access for fucks sake.
> deploy code
Tell me, how is your agent able to deploy code more effectively than hitting merge on a PR? Or do you simply mean setting up CI/CD for you? That's usually a set and forget thing that doesn't take much time, so I'd rather do it myself.
>Again it's not about typing speed. High level plans simply don't work very well, especially for big tasks where the optimal solution actually would take 2k lines. Unless you are building something that is extremely generic, AI coming up with the optimal solution rarely ever happens.
Nope. Not universally true. It depends on the randomness of the RNG, the type of task, the agent, and also the current state of AI. Right now, for frontier models, what you're saying holds only a minority of the time, ime.
>Not a liar, and I'm sorry to say, but AI really doesn't take much skill to use. People who say such statements give me the impression that their ceiling for skills is quite low.
It does take a little skill. Very little, and it requires new habits that are harder to pick up. For example: I never work on one project at a time anymore. I work on 5 projects and context switch between all of them. Prompt, switch, come back, prompt, switch, prompt, switch, review, etc. That takes getting used to.
>I make templates I can copy and tweak to do this faster than it takes to tell an agent what to do.
I have a huge change, and within that change the agent does this automatically.
>Don't do that... Sure get it to write you some SQL to update a table, but don't give it DB admin access for fucks sake.
You can fuck off prick, don't fucking talk like that to my face. I do it and I have no problems with it. If you don't want to, that's your own fucking prerogative.
>Tell me, how is your agent able to deploy code more effectively than hitting merge on a PR? Or do you simply mean setting up CI/CD for you? That's usually a set and forget thing that doesn't take much time, so I'd rather do it myself.
Because the agent merges for me. Prompt: "Complete task A". Agent: "Task completed", Me: "reviewed and good to go"
The agent then does its thing. Of course there are always some adjustments and more conversation than this, but that's the gist of it.
I interpret "faster than AI" to include writing the prompt. For me (scientific computing) it is more often than not faster to write out a simulation or design in a language I know inside out, like Fortran or Mathematica, than to explicate the requirements to an LLM to request the code. Obviously, if someone wrote out a prompt for both me and the LLM, the LLM would be way faster, but I don't think that's what the commenter had in mind.
If you're good at SQL, or SQL-like languages like Linq, it might be more efficient to precisely write a reasonably complex query yourself than to explain it in detail to an AI.
I am very good at SQL; I've worked with it half my life, taught it, and know all kinds of SQL flavours. But good luck getting ahead of AI on a complex query with recursive CTEs, left outer joins, 625-column tables whose semantics change conditionally on certain properties, and then some obscure Oracle package APIs.
No way you beat an LLM on this, even on trivial ones. LLMs have been better at this since at least 2024; if you haven't noticed, perhaps you're not doing enough SQL.
But of course, it also took people years in the 90s to realize they couldn't outpace Visual Studio by being very good at x86 assembly.
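For a flavour of the query shape I mean, here's a toy version, runnable against SQLite via Python. The schema and data are invented for illustration; the real thing has hundreds of columns and vendor-specific APIs on top, but the recursive-CTE-plus-left-outer skeleton is the same:

```python
import sqlite3

# Invented example schema: an org hierarchy plus a side table that is
# only populated for some rows (hence the LEFT OUTER JOIN).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE org (id INTEGER PRIMARY KEY, parent_id INTEGER, name TEXT);
CREATE TABLE headcount (org_id INTEGER, n INTEGER);
INSERT INTO org VALUES (1, NULL, 'root'), (2, 1, 'eng'), (3, 2, 'db-team');
INSERT INTO headcount VALUES (2, 40), (3, 7);
""")

# Recursive CTE walks the hierarchy from the root, tracking depth;
# the LEFT OUTER JOIN keeps rows with no headcount entry.
rows = conn.execute("""
WITH RECURSIVE tree(id, name, depth) AS (
    SELECT id, name, 0 FROM org WHERE parent_id IS NULL
    UNION ALL
    SELECT o.id, o.name, t.depth + 1
    FROM org o JOIN tree t ON o.parent_id = t.id
)
SELECT t.name, t.depth, COALESCE(h.n, 0) AS n
FROM tree t LEFT OUTER JOIN headcount h ON h.org_id = t.id
ORDER BY t.depth
""").fetchall()
# rows -> [('root', 0, 0), ('eng', 1, 40), ('db-team', 2, 7)]
```

Trivial at this size, sure; the point is that the same shape at 600+ columns and five levels of conditionals is exactly where an LLM's recall of the schema beats hand-typing.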
Not the parent but I've had this happen when debugging for sure. Sometimes I ask Claude Code to help me debug something and it makes a wrong assumption and just churns in circles burning tokens. While it's doing that I realize the problem and fix it.
What I meant is that only sometimes am I faster than Claude at debugging. When it's a standalone problem (a report in Sentry, say) and I immediately know where to go to fix it, it's faster to do it myself than to tell Claude what the problem is and where to look, and then wait.
Bugs happen during feature development, as you say, but then Claude is already in context and I don't need to tell it where to go; it sees the bug via failing tests or something similar.
BTW, one thing that helps my Claude with debugging harder problems is telling it to apply the scientific method to debugging: generate hypotheses, gather pro/con evidence, write it to a journal file debug-<problem>.md, and design minimal experiments to debunk the hypotheses.
You can add that as a skill, and sometimes it will pick it up automatically, but it works wonders even as a single sentence in the prompt.
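If you want it repeatable as a skill, a sketch of what mine looks like (the filename, frontmatter wording, and step list are just my own convention, not anything official):

```markdown
---
name: scientific-debugging
description: Apply the scientific method when debugging hard problems.
---

When debugging a hard problem:
1. List competing hypotheses for the root cause.
2. Gather evidence for and against each; record it in debug-<problem>.md.
3. Design the smallest experiment that can debunk the leading hypothesis.
4. Repeat until one hypothesis survives, then fix and verify.
```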
...but then you ignore all the other times CC got it right. Statistically, I'd put my bets on CC (or Codex (or PI)) getting it right more often than you would, and being right more often than it's not.
Besides, it is a system you query, and it responds. I'm sure your DBs aren't always 'right' either, particularly when you ask the wrong questions.
I don't really want to advocate for Musk, but is it not possible that his goal was to merge with Tesla as an alternative to OpenAI becoming a separate for-profit? If the option of staying a non-profit was going off the table, I'd also probably want to advocate for merging with an existing for-profit I owned that had aligned interests.
It's certainly the case that collaborative ceremony can be mismanaged, and that's frustrating when you need time to actually implement. I don't expect that complaint to go away; those using AI heavily will just replace it with not having enough time for prompting.
But I have also worked with some who refused to participate in collaboration; they felt their time and ideas were superior to everyone else's, and there's no excuse for that.
The whole Prompt API is poorly designed. Devs will end up fine-tuning very specific prompts out of necessity, only to have them break with the next model update.
The logic of withholding the model version to prevent fingerprinting is laughable when the suggested countermeasure to fingerprinting via prompting is that the model should only update when the user-agent string updates. Just put the damn API behind an explicit user permission.
They want to force the Prompt API into being a de facto standard without getting buy-in from the rest of the web standards body. Having it on by default serves this goal.
I switched to Deno because it is the only option of the three that allows a monorepo workflow without building .d.ts files. Bun and Node both do type stripping or TS compilation, but only for the entry package of the running script, not for any of the linked dependencies from the same repo.
There are still things I dislike about Deno, but it really does make package development a lot simpler. JSR is a great upgrade from NPM, and Deno makes it so simple to publish to both NPM and JSR. Strict IO permission system and WebGPU support are also nice to have.
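For reference, the monorepo setup that makes this work is Deno's workspace support; a sketch, with the member paths, scope, and package names invented:

```jsonc
// deno.json at the repo root
{
  "workspace": ["./packages/core", "./packages/cli"]
}

// packages/core/deno.json
{
  "name": "@myscope/core",
  "version": "0.1.0",
  "exports": "./mod.ts"
}
```

With that in place, `packages/cli` can `import { something } from "@myscope/core"` and Deno resolves it straight to the workspace member's TypeScript source, with no .d.ts build step in between.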
> wrap a project into an `*.exe`
Deno makes this simple too (`deno compile`), though that's where its bundling features stop. Honestly, I'm okay with that; I'd rather use Rolldown or Vite for web or library bundling.
Deno has been great for wrapping the dozens of REST APIs I need to use in MCP servers. The no-compilation thing means I can push and it's literally deployed in seconds. I run several dozen of the little servers for various use cases; it's a very cheap way to build an automatable life.
I do think simulating consciousness is within the realm of possibility. I also think it's absurdly silly to think LLMs (no matter their size) are conscious, if for no other reason than they can't actively learn.
I would maybe be comfortable classifying them as a snapshot of consciousness, but when you are interacting with an LLM it's far from interacting with a conscious entity.
How severe are we talking? I don't think there's any analog for how bad learning is for LLMs, which need multiple human lifetimes' worth of data to be trained.
In the hypothetical case that I truly lost all ability to learn, then yes, I would no longer consider myself conscious. I'd be an echo of a previously conscious entity.
I strongly believe that today's LLMs and API harnesses simply are not at the technological stage where such an API belongs in a standard.
However, if this needs to be done, then it needs to be an opt-in, per-site permission at the very least, and there should be a way to verify the identity of the model being prompted (which extends even to minor tweaks made to system prompts).
As a user I need to be sure that I can't be fingerprinted by navigating to a random site and them using this API without my permission.
As a dev I need to know what model my users are using, so I have the option to craft specific prompts per model.
Unfortunately, I have seen some really good software engineering peers regress into bad engineers through an increasing reliance on AI.
Conversely some very bad engineers (undeserving of the title) have been producing better outputs than I ever expected possible of them.