Hacker News — redfloatplane's comments

It’s funny you say that as about halfway through I was beginning to wonder if this was at least Claude-edited. Absolutely no shade to the author meant, I think it’s a thoughtful article, but I _did_ feel the sheen of AI co-authorship.

It raises the question of how much text I have read that I did not realise was LLM-generated. I think I have a decent nose for it but I’m not perfect, there must be false negatives (and false positives, as it certainly might be with this article). What will it mean when I can no longer tell the difference?

Edit: thinking on it a little more, I hope the author doesn’t feel insulted by my comment given the subject matter of the article at hand. Sorry, it’s early morning! I’m sure I am wrong about my assessment. Which now really makes me wonder about the above


Hey! I'm not insulted at all. My position is that of a Luddite: I think technology is neutral, but deployment is not. My critique is structural, and I don't blame people in or out of tech for adopting AI to be able to survive.

No AIs were harmed in the writing of this post, either physically or by the sharing of earlier (cringe) drafts.


Thank you for writing this piece; it resonated a lot with me. Looking forward to reading more from you in the future.

> What will it mean when I can no longer tell the difference?

It just means that you will have to evaluate prose on its own merits (aesthetic, logical, etc).

The main problem with LLM-assisted writing is that effort-to-write is now much lower than effort-to-read -- the LLM-prose-style is simply an imperfection that can sometimes help the reader bail on a piece (and there might be false-positives).

Most people are already biased against reading long pieces, and seem to skim them more often than not. These people are _probably_ a little worse off than before, but they are not paying full-price for being hoodwinked. The people who end up paying full-price are probably going to become more sophisticated in how they choose what to read. I can't tell if this will be good/bad for publishers and/or advertisers.


Pangram agrees with you. About 25% of the text trips the detection threshold, mostly towards the latter half.

I don't want to make any accusations, just give some evidence to the above comment.


I have bad news for you...

Believe me or not, that's good news for me. I actually really enjoyed your writing, and I'm glad that feeling wasn't misplaced. I'm sorry if my remarks came across as mean-spirited...

Nothing to apologize for!

I (and I'm sure many others) have been thinking about this a lot over the last couple of months. I called it "Extremely Personal Software" in a blog post a few months ago (https://redfloatplane.lol/blog/14-releasing-software-now/) but there are lots of names and concepts floating about for the same basic idea.

I think it's possible the amount of new software that will be written for an audience of 1-10 will be greater in 2026 than in any previous year, and then the same again for many years to come. I also think a lot of this software will be essentially 'hidden' - people just writing this stuff for themselves because the cost to say things to an agent is very low compared with the cost of actually planning out a software design and so forth.

Interoperability will probably be important in the next few years and I wonder if this is something solvable at the agent/LLM level (standing instructions like 'typically, use sqlite, use plaintext, use open standards' or whatever). I also think observability and ops will be pretty important - many people want personal software but don't care for the maintenance and upkeep.
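As a sketch of what those standing instructions might produce in practice (everything here is hypothetical, not from any real agent setup): a tiny personal tool that keeps its data in sqlite and can always dump it as plaintext, so a future program, or a future agent, can read the data without this code.

```python
import sqlite3

# Hypothetical personal tool following "use sqlite, use plaintext":
# the data outlives the program because both formats are open.

def init_db(path=":memory:"):
    """Open (or create) the notes database."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, body TEXT)"
    )
    return conn

def add_note(conn, body):
    conn.execute("INSERT INTO notes (body) VALUES (?)", (body,))
    conn.commit()

def export_plaintext(conn):
    # One note per line, nothing proprietary -- any tool can consume this.
    rows = conn.execute("SELECT body FROM notes ORDER BY id").fetchall()
    return "\n".join(body for (body,) in rows)
```

The point is less the code than the contract: sqlite for structure, plaintext for escape hatch.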


I called it "software".

It's so strange to me that since the 1960s, with BASIC and then later dozens of https://en.wikipedia.org/wiki/List_of_educational_programmin... including Logo by Feurzeig/Papert/Solomon, there has been an effort precisely to help beginners program software.

The effort was not to onboard future professional software developers but rather to make the "personal" in personal computer, or PC, meaningful. It's YOUR computer, you can put YOUR software on it. In fact even pocket calculators do that.

We keep on re-discovering the foundations.


To me this doesn't seem like a step towards those foundations, but another layer of loss of agency. You can run "a" model locally, but you cannot make it locally (at least not for the purpose of just talking software into existence). You need to slurp up all the internet first, so to speak. And even if you could do that, you still depend on people putting new things onto the internet for you to slurp up. So is it really my software? What if it breaks or I want a new feature and AI corp nuked my account? How much did I learn during my time having it done for me?

And before anyone mentions it, I don't think the fact that I need a compiler and a manual and some example software to learn from is quite on the same level. I might be wrong but I would need some convincing.


You can also run a computer at home but you cannot even make a 486 from scratch at home, let alone something released more recently.

I agree on the SaaS side of the story, that's why it is so important to have open models.


Agreed, I wasn't advocating using LLMs, even "open" or "local" ones.

I think "self-hosted", "home" or "company"/"office" should be the term we use, instead of "local", since

1. every LLM is _local_ in relation to the location of the storage and/or the computers hosting/using it

2. the LLM running in your home is only _local_ until you step out of the door, but if you have some Tailscale or ZeroTier VPN setup, you can still access it _remote_ly...


How does any of that impact a user who just has a specific task they want to accomplish and who doesn't have a CS degree?

Is it "their" software? Sure, if it meets their needs. What if the AI changes? Who cares, I already have the software. All the what ifs are solved by taking the current code, stuffing it into any AI you like today, and getting the new version.

As a user, this all sounds like a great deal. Devs can continue wringing their hands over code quality and long term support and architecture and preferred framework, meanwhile the user who had an itch got it scratched and didn't need nor care about any of those things.


> What if the AI changes? Who cares, I already have the software. All the what ifs are solved by taking the current code, stuffing it into any AI you like today, and getting the new version.

It's just dismissing the question. If the AI changes, just use one that didn't change. If it gets 1000x more expensive, just use one that remains cheap.

Apart from the fact that without new input to learn from, things will probably stagnate in new exciting ways, on top of the stagnation, bloat and slop we worked so hard to make a culture over the last decades.

> Devs can continue wringing their hands over code quality and long term support and architecture and preferred framework

I mentioned none of those things.

> the user who had an itch got it scratched and didn't need nor care about any of those things.

And I don't care about that user when it comes to the question of my agency and autonomy. It's like people discussing how to make cats do tricks and someone going "just get a dog".


> It's YOUR computer, you can put YOUR software on it. In fact even pocket calculator do that.

I'm pretty sure this exists. It's called OSS or, more ubiquitously, Linux.

The problem is, of course, no one wants to publish software for your PC/handmade OS. Which makes it a huge problem. You can't write every piece of your OS without wasting a huge amount of time. Nor do people generally want this.


OSS/Linux is "our" software. It's made by us for us (or others if you don't contribute).

Your software can be made by you, for you. It can be open source/free software if you want. Others can contribute to it if you want, but it can also be open source without accepting external contributions.

My point was to highlight that having software made by you for your machine is not new. Arguably the way to do so changed but I would say the principle remains.


> OSS/Linux is "our" software. It's made by us for us

If by "us" you mean big bucks corporations, then yes: ~80% is big corporations [0]. Unfortunately, it does not look like it's a personal OS.

And we badly need the personal platform with the personal OS.

0 - https://www.reddit.com/r/linuxquestions/comments/za564c/is_i...


It would be bad if there weren't a significant number of companies paying for work on Linux to continue, because so many of them are benefiting from it, and at least some of them realize that ensuring sustainability into the future is a good idea and worth investing a salary or two. (As in, paying some of their employees to work on Linux).

Don't fall into the trap of assuming "A big company did it / paid for it to be done, therefore it must be bad". I see that mindset (which I'll call anti-corporatist) on HN from time to time. Companies are made up of people, and it's the people that make the decisions. Some people are good-natured, some are greedy and grasping. And the company that acted one way one decade can turn around and act completely differently the next decade, because a different person was at the helm.

Fundamentally, it's about the people, not the companies. The anti-corporatist mindset is prone to forgetting that.


A random number from a Reddit thread isn't a great source, but even then I agree with some of the comments; it might even be 99% nowadays.

The question is what does it change? Are the contributions from those corporations irreversible or are they targeting their own products for e.g. virtualization for cloud computing which doesn't affect the typical personal OS user?

Anyway the kernel itself was still started by a random student in his dorm. GNU was just started by another student. That possibility still exists today. It's also possible to trim that kernel with e.g. Linux-libre or even run Hurd.

One can use Debian with KDE today and neither see nor be subject to any corporate impact in the form of arbitrary limits on their usage. If they decide not to personalize it more, it is most likely because they didn't consider it, not because they can't.


> Your software can be made by you, for you.

Yeah, but it's probably derived from OSS software anyway either via license or LLM. That said, you can customize your Linux/BSD/Haiku/TempleOS as much as you want.

But consider the following: even in the better case of an OS with 1% of the userbase (vs. 0.0000001%), no one wants to support it.

Want to play Diablo? Prepare to sit down and waste your time.


That's the beauty of containers, virtual machines like QEMU or compatibility layers like Wine/Proton. As long as your super esoteric software implements the interfaces those rely on, you are able to run everything else.

No one will support Windows if no one uses it either.

With the current situation on the hardware market, it makes me sad that we discover it only now. If things continue the way they are, there will be no Personal Computer any longer.

Ah well don't despair there are already alternatives e.g. https://www.crowdsupply.com/mnt/reform

Agreed. I've already started writing software for myself using Claude. I would never have done this if it weren't for AI - I simply don't have the time otherwise.

I now have tailor-made apps with all kinds of bells and whistles that commercial products can't offer easily (I fall under non-commercial usage, which opens a lot of doors), and that free software might offer, but later.

I have also learnt a lot technically in the process, since I've been able to venture into what was for me unknown territory, but at controlled cost.

I plan to create more such apps in the future. What is certain though is that my cooking app has immediately displaced all the others on the market, because none of the others cater to my requirements.

The production side is indeed of specific interest - most users don't run production software, so I had to think about that one. Tailscale and Cloudflare came in quite handy, and there is indeed a market here.


I don't know how to tell you this, but people have been writing custom software for personal use for decades. I've been doing it since at least 2009! I find it hard to believe that there is a demographic of people that were yearning to write code, but simply could not because they lacked LLMs. Is it the price? Are people simply too cheap to buy books? Or have they simply "forgotten" how to patiently and thoughtfully read them? Or has the quality of tutorials/documentation of languages/libraries/framework online decayed in the last decade? Or is it really that people have struggled to type characters of code into their text editors[1]?

Basically, I am prepared to accept that there is a friction that LLMs lubricate away, but what is the source of the friction, and why am I (and a bunch of other colleagues) not feeling that friction daily in our practice?

[1]: And if so, where did we programmers and computer scientists go wrong? Were subroutines and macros not sufficient for automating all of that excess typing? Were Emacs and Vim simply not saving enough keystrokes? Did people forget how to touch-type?


> Basically, I am prepared to accept that there is a friction that LLMs lubricate away, but what is the source of the friction, and why am I (and a bunch of other colleagues) not feeling that friction daily in our practice?

You must be extremely talented and fast if LLMs make no difference for you.

For people like me though, it's another story: I've been doing this professionally for 25 years and of course, like many, I have been writing custom software for my own use all this time, on personal time. But with LLMs I get better results, faster and with very little effort. And that is the difference between another item in my list of unfinished software that consumed too much of my weekends and a cool utility/toy/useful thing I got after a few fun and interesting chat sessions.

> I find it hard to believe that there is a demographic of people that were yearning to write code, but simply could not because they lacked LLMs.

We didn't lack LLMs, we lacked time and energy.


I still vaguely remember how difficult man pages were to understand when I first started reading them. I'm pretty sure the biggest obstacle is the fact that most documentation is written for people who already know the standard computer science terminology. I have a generally negative opinion of LLMs, but one thing they do very well is function as a "reverse dictionary". You can input a idiosyncratic description of something you want and get the standard terminology. This is a new and valuable capability.

There is a universe out there, where most of the world is reading Solaris man pages, instead of Linux man pages. Whatever your thoughts on the Solaris OS, I think it is fair to say that no operating system has ever matched the quality of its man pages.

Interestingly, I also converged on the "reverse dictionary" usage of LLMs, in around 2024[1], mostly to indulge in (human) language-learning.

An excerpt from the post below:

> It is a phenomenal reverse dictionary (i.e. which English words mean "of a specific but unspecified character, quality, or degree"). It not only works for English, but also for Esperanto (i.e. which Esperanto words mean "of a specific but unspecified character, quality, or degree"), as well as my own obscure native language. This is a huge time-saver when learning languages (normal dictionaries won't cut it, and bi-lingual dictionaries are limited, if they are available at all). Even if you are just using a language you are fluent in, a reverse-dictionary-prompt can help you find words and usages, and can also help you find "dark spots" in the language's lexicon.

[1]: https://galacticbeyond.com/chat-room-dispatches-intelligence...


I've commented on this subject before, but the fact of the matter is that kids getting into high tech and programming mostly don't read books anymore. How do I know? Recently I was hanging out with a bunch of high school students who asked me how I learned. I said it was mostly via books and man pages. "Yeah, don't sleep on high quality written material. O'Reilly. Wiley. Addison-Wesley. Manning. MIT. No Starch Press. &c..."

Well. You should have seen the look on their faces. I might as well have morphed into the Steve Buscemi meme "How do you do, fellow kids?" They looked at me like I was a total relic or greybeard and said things like "Nah, nobody reads tech books anymore; I learned Typescript from YouTube videos."


Already in 2008, as a millennial teen without internet at home, I was learning C# and XNA without a single book, just tutorials and official docs I downloaded from the library alongside Visual Studio Express. I couldn't have afforded books on it anyway, but I can't imagine teens in 2026 using anything other than Youtube and some tutorials to learn this stuff.

I learned programming from tutorials :) Only after I kept encountering terms in tutorials (long after I was building (badly organized) programs) that I didn't understand well did I decide to read my first book, K&R's C. This was when animated gifs were a novelty not worth the data transfer time.

I think every generation feels like their way of learning was the best, but we all make it work. There was a time when the architects of systems directly tutored programmers on how to write programs.


That has been the case for a decade

> most documentation is written for people who already know the standard computer science terminology

Not really. It's probably complexity for the sake of it in some cases. Also it's frequently ambiguous, and I'm really not sure why: it looks like some developers lack the basic logic (?!).


This is the best use case of LLMs, the one I use it the most for.

> I find it hard to believe that there is a demographic of people that were yearning to write code, but simply could not because they lacked LLMs. Is it the price?

Yes, because the price is measured in time.

With LLM tooling I’ve churned out idiosyncratic tools that fit my use cases quickly. Takes maybe a day instead of a week. A week instead of months. The fast turnaround changes the economics of writing custom tools for myself.


Not speaking for the OP, but my biggest constraint is time. Now with agentic coding, I can work in 5 to 15 minute bursts a few times/day and make meaningful progress on projects, whereas before I would never have been able to context-shift from my day job long enough for a personal project.

Yep! Time was the biggest factor. I could have created that one tool I had for years been wanting to make, but tech moves fast, and I have a job and a family and a passion for music and yadda yadda yadda. AI has been a game changer for actually accomplishing big dreams I just didn't have the time to bring to fruition.

Well, I’ve been writing code for decades so I know because there was a time ( when I was younger ) where I did just this.

I also know that these days, for all kinds of reasons, I do not have the time to write the tools I’m writing now without AI. I don’t lack the ability, and I could - it will simply be multi months side projects that I can’t / won’t complete.


Given how often younger people find my typing speed startling, I think it has been somewhat forgotten (US high schools had "keyboarding" classes at one point but that seems to have fallen off...)

Seriously agree. I am wildly overeducated and I often think the most useful class I ever took in high school was my senior year elective for a typing class. On old IBM typewriters. And the only class I took in high school with non-honors kids. Typing insanely fast, especially for someone who is a fast thinker, is a bit of a magic power in itself.

It's a question of time and priorities.

I work 8-10 hours a day and outside those working hours I want to spend time with my family, my friends, and my hobbies.

At the same time, during those 8-10 working hours I don't want to spend time fiddling around with different programming languages or software patterns just to spit out a quirky little tool that would make my job a bit easier.

For example, I wanted a local to-do list software that I could easily integrate with my workflow. Spent some time trying to find one, but not a single one worked the way I wanted. So, one morning, I spent 5 minutes detailing what I wanted, prompted it to Claude and let it rip while I was working. 30 minutes later, it was ready.
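For illustration only, a to-do tool of roughly the shape described might start as something this small (the names and behavior here are invented, not the commenter's actual tool):

```python
from dataclasses import dataclass, field

# Hypothetical sketch of a minimal personal to-do tool: just enough
# state to add items, mark them done, and list what's left.
@dataclass
class TodoList:
    items: list = field(default_factory=list)  # [text, done] pairs

    def add(self, text):
        self.items.append([text, False])

    def complete(self, text):
        for item in self.items:
            if item[0] == text:
                item[1] = True

    def pending(self):
        return [text for text, done in self.items if not done]
```

A persistence layer and workflow integration are where the "fits my workflow" value comes from, but the core really is this small.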


Speaking for myself, it's less of a yearning to write more code, than it is a yearning for tools that work a specific way.

I write plenty of code at my job, and generally don't have the desire to write more code as a hobby, except in rare cases when the mood really strikes.


I have been writing my own custom software for myself for over 30 years. But in the last six months I have written a lot more of it because the language models make it so much faster and easier to do so.

>Are people simply too cheap to buy books?

Yes, definitely, though I'm unsure what it means being cheap here.

Not everyone has an SV income, nor infinite time to read all the books such an income would allow them to buy, let alone integrate the lessons at a practical implementation level. Plus people might have other interests in life, and family and friends they want to dedicate time and warm attention to.


If you are saying that what we had previously was actually as easy as literally writing "make me a web app for arranging seats at a wedding and put it on Vercel" then you are very divorced from reality.

I know how to do all of these things and even find them easy, but it's just much faster now. These are personal one task toy apps, but they are useful.


There's a whole lot of people who want software to do certain things but whose job isn't programming and whose life requirements don't allow the time for all the book reading, tutorial running, and practice to write useful code.

I'm a long time ops guy. I script, but I spend most of my time configuring, patch testing, and keeping the low level infra running, much of which doesn't require "coding" per se. Infra as code is in the grand scheme relatively new and still not ubiquitous, despite what Silicon Valley would have you believe. I never had a need to learn to code to a level to do many of the things I'd like to see happen and find useful. Now I can make those software desires a reality without having to alter my career, preferred hobbies, or much of anything else about my life.


> I don't know how to tell you this, but people have been writing custom software for personal use for decades. I've been doing it since at least 2009!

GP never claimed otherwise.

As for the rest of your comment, it's frankly a bit patronising: are people too cheap, are people too lazy to read, are people unable to type...?

No, people are busy, a fact which GP made abundantly clear in the very first paragraph.

> I would never have done this if it weren’t for AI - I simply don’t have the time otherwise.


But if people are so busy, when are they planning to use their suite of bespoke software anyway? Isn't this all about recreation? This blog post certainly seems to be that, at least. Is this really all about spending money on AI to write something that you then use just for your job? Because, apparently, you have no time otherwise?

If it's not for fun, what's it for? It doesn't really seem like anyone is making stuff they are going to use next month anyway. But I totally get how it's recreational, and can be fun in the "computer, make my program" kind of way.

Otherwise, why not, e.g., just use or fork vim?


That's one question never answered. It's way easier to write a vim/sublime/emacs plugin than a whole brand-new editor. These days, I try to use single-purpose programs that do one thing and compose them, instead of trying to get the "one true" software.

> But if people are so busy, when are they planning to use their suite of bespoke software anyway?

The original blog post, and the ensuing discussion, is about creating software that fits your specific requirements, for the purposes of daily use.

As for using vim, the author did, for 20 years. The article discusses that in detail.


I don't want to be too tsk-tsk here, but please remember the community standards. It's not appropriate to assume bad faith, and we should strive to be charitable in the comments section [1]. Saying vim here is clearly a reference to the article, which has a whole section about it. To borrow some of that AI lingo, we are already sharing all the context here; why speak past me like this?

Further, the article does not mention "requirements," it mentions the "joy" of having software "fit" just you. It goes through I think a certain amount of care in the writing to say they are enabled by their system only insofar as there is a "satisfaction" to not dealing with something from without that is for a more general audience.

At the end of the day, life is what you spend time doing. I don't think the author or anybody really thinks cumulative time is saved one way or the other here. This is all a product of what we want to spend time doing. And I am just saying, that's recreational! It doesn't have to be the case that something is lesser if its not about maximizing productivity or making more money. Either you have a "decades-long" project configuring a system, or your spending a decade writing new software for you, that's a "quiet pleasure to use." It's clearly either way about the project of it. Do we really think anyone is going to vibe code a vim clone and, insofar as they use it, not continue to tinker with it? Isn't that like the whole upshot here? That you can make things forever?

A guy who uses i3/sway and rolls their own DE even before vibe coding world is already a particular kind of person with certain priorities and judgements about time! And that's cool! I am that kind of guy, fwiw.

A lot of people into the synthesizers and related stuff talk about so-called "gear acquisition syndrome," where, in the search for hardware that fits their "requirements" as serious musicians, the time (and money) they end up spending just getting new things ends up eclipsing time doing the actual thing (making music). Depending on how much money they have, this doesn't necessarily become bad, one just realizes they are maybe a synth collector more than a composer.

Even if I had all the AI money token blah blah in the world, I would still hesitate to spend time rebuilding an IDE or editor on my weekends, because for me personally, that's time getting in the way of using the computer to make my things. Like, I am hungry; I do not want to forge my own chef's knife first. But I do think the people that do have a kinda cool hobby! Or, if it's about spending my weekend making an OS so that I can, come Monday, read work emails exactly how I want, well, that's just terrible to me, but everyone has their own work-life balance I think.

Again, I am not trying to explain away or, I guess, be negative here. There are lots of kinds of people, and that's ok! I think it's just interesting how we now traffic in concepts of "time" and "productivity" and "serious computer work vs. recreational computer stuff" these days!

1. https://news.ycombinator.com/newsguidelines.html


I have written multiple IRC bots in the last 20+ years. It's my go-to project to test a new language, mostly because I know the protocol inside and out and it has some gotchas that languages can't handle comfortably (managing a bunch of open TCP sockets with threads/subprocesses mostly).

Have I tried to write my own IRC client yet? Nope. Because even though I know how to, the time spent wouldn't have been worth it. Getting from zero to feature parity would've taken me weeks or months of evenings doing nothing else.

I've got my own irccloud/thelounge clone running now, took me two weeks of calendar time and I spent maybe 6-7 evenings on it and a few hare-brained ideas with Claude on my phone.

The amount of "lubrication" LLMs have given me in going from idea to something good enough just for me is completely bonkers.
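For readers who haven't written one: the PING/PONG keepalive is one of the protocol gotchas mentioned above, since a server drops any client that misses it. A minimal sketch in Python (illustrative only, not the commenter's bot):

```python
# Minimal IRC line handling of the kind a bot needs -- a sketch, not a
# full client. Network I/O is omitted; this is just the line protocol.

def handle_line(line):
    """Return the reply to send for one raw IRC line, or None."""
    line = line.rstrip("\r\n")
    if line.startswith("PING"):
        # Echo the server's token back: "PING :token" -> "PONG :token"
        return "PONG " + line[len("PING "):]
    return None

def parse_privmsg(line):
    """Split ':nick!user@host PRIVMSG #chan :text' into (nick, target, text)."""
    prefix, _, rest = line[1:].partition(" PRIVMSG ")
    nick = prefix.split("!", 1)[0]
    target, _, text = rest.partition(" :")
    return nick, target, text
```

The concurrency the parent mentions (many open TCP sockets, threads/subprocesses) sits on top of exactly this kind of per-line handling.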


> I find it hard to believe that there is a demographic of people that were yearning to write code, but simply could not because they lacked LLMs.

I am in that demographic. I have been hacking on other peoples' software as a necessity, to get it to work or to do things I wanted that it didn't yet do, all my career. LLMs came along and afforded me the opportunity to act like a full time programmer when I'm just a paranoid systems monkey who is normally obliged to treat programming as a barrier to be overcome, not a career or even primary hobby.

In my specific case, the reason I was yearning to write code but did not was simply because there weren't enough hours in the day, and I wasn't told that I should spend my on-the-clock hours doing it (unless it was for automating my job). So despite the fact that I have had hundreds of instructional hours of programming classes, learned the basics in half a dozen languages, and been "hacking" code for years, none of it stuck because I never had an employer say to me "right, you're going to be responsible for writing (or maintaining) this Perl app here..."

> Basically, I am prepared to accept that there is a friction that LLMs lubricate away, but what is the source of the friction, and why am I (and a bunch of other colleagues) not feeling that friction daily in our practice?

Learning a programming language and then not getting to use it more than a few days every 3 years means you don't actually learn the language. It's more like a pleasant evening playing a game.

You and your colleagues are, I presume, programmers. I'd wager what I just described is Greek to you. So try to imagine it this way: somebody comes out with a crazy new prototype CPU. It's got a radically different ISA, so it doesn't even have C on it yet -- you have to poke registers with a brand new language that's like Ada on mescaline and it's built on a flavor of assembly that's like nothing you've ever heard of. So your boss tells you to learn it, then 2 weeks later pulls you off the project to do something normal and takes the dev board away.

If you don't see that CPU again for 3 years, how long are you going to retain that bit of knowledge you acquired? Well, that's what it's like for us programmer-adjacent nerds who spend all our time building systems, replacing failed components, crimping cables, writing disaster recovery documents, adjusting backup schedules, and getting woken the hell up night after night because another filesystem filled up.

I have no data to share on how many of us there are out there in similar situations world-wide, but I have met numerous traditional sysadmins in my time who were competent at automating things with shell scripts, not competent at writing "software" in "real" programming languages, and are probably using LLMs now to remedy that lack of skill.

For every 10 DevOps guys there may be one trad sysadmin out there who knows enough Perl to glue the server farm together and keep it running but can't be bothered to learn Python. Or the ratio may be the opposite. But whichever it is, that demographic very much exists.


I had the same reaction. We're headed into a period where you can shape your tools exactly as you like them; artisanal rather than factory-created workshops, essentially.

I think the instinct that APIs, validation layers, and so on take on a much higher importance is right. I have a few internal tools that made sense to make libraries out of, and once the first library is good and a test suite is comprehensive, porting to a bunch of different languages is extremely simple.

Everting that, it's also going to be simple for someone to hook up to this library with custom tooling.

Really interesting period in computing, for sure.


> We're headed into a period where you can shape your tools exactly as you like them

What period were we in for the past 50 years?


Since roughly 1995 or so we've been in a world where quality tooling was provided by on the order of 1,000s of developers, mostly open source. GNU, Xorg, Apache, emacs, nginx, and so on. Or you could opt in to the Microsoft ecosystem.

The ~20 years prior to that we were in a world where you chose to align with either Microsoft's tooling, IBM, or shops providing Unix tooling from proprietary vendors.

I elide a nearly infinite amount of detail, obviously.

What's new now is that you can get your own window manager written to spec in under a week, perhaps much more quickly, not just choose one of a few major window managers and configure it in accordance with the chosen configuration options delivered by the large developer team.


The reason I don't bother writing code these days is because my use cases have been solved, and if they weren't, I'd tweak the most suitable candidate. One of my principles is to keep my workload small. More often than not, things start with a small script or plugin and then grow according to my needs. Why replicate what others have already done?

Because it's fun. And because your experience using a tool is fundamentally different if you made it yourself, compared to if it's something someone else made for you.

I don't think I can explain the difference, but it feels really different. Even if you used claude.


It never felt fun for me to write software fully with LLMs. It feels disorientating; it produces a lot of code that you have no familiarity with and no authorship over. It feels like you're a teenager again, copy-pasting code from the internet or a journal and hoping it will work.

I promise you some professional developers mostly get better at figuring out what to copy before pasting. Those folks are going to be fine with having no authorship over the majority of the code they put into their applications. For others, there’s a tension between wanting to write and understand the code on one hand and wanting to be a product manager over the application and delegate some control to the LLM.

When others had built a whole-ass house and I just occasionally needed the kitchen table, I had to deal with the hassle of having a whole house and just used the table. And even the table was the wrong shape, but I could deal with it. Asking the Others to make the table modifiable would've been a massive effort of PRs and mailing lists I didn't want to get into.

Now I can build a bespoke table in an evening or two and it fits my stuff just perfectly.

I could do it before too, but it would've taken too long for me to bother, so I dealt with the whole house along with the table.


There’s a lot of small programs out there. Especially if you go to the BSDs where small programs are the norm.

"Everting"?

I learned it in the math context - a sphere eversion is a 3 dimensional process that ends with the inside of the sphere becoming the outside.

I had to look it up too. It appears to be a synonym for "inverting" used in some fields like biology and medicine.

Reminds me of this blog post and conference presentation on home cooked software by Maggie Appleton.

https://maggieappleton.com/home-cooked-software


Great post, thanks for sharing. Evidently I am far behind the real thinkers! The Robin Sloan post mentioned is also very good.

I vibe coded an image viewer for myself to address a big pain point that had been annoying me for years. [1]

[1] https://daniel.lawrence.lu/blog/2025-10-22-sriv-simple-rust-...


Interesting points. With the extreme cheapening of the cost (time/skill) of software production, we can have "Extremely Personal Software", as you mention and as demonstrated by the source. I wonder if we will reach a stage where "software" is written by a computer for an audience of 1 and for a single task, to be run once only, via an interface that works for all tasks. The very concept of software as something that users have to learn to use (memorizing keybindings, for example) might go the way of the punch card.

More like Star Trek, we would just ask "computer" to do things, and its machinations (and "software") will be invisible to us. We would just have output to deal with.

I think this would mean a lot of things. I'm sure I can't fathom all of the implications, but it sure makes me feel old! Interesting times ahead.


LLMs seem to be great for speeding up the creation of things that aren't all that hard to write in the first place.

They don't seem to be helping much with difficult tasks.

Text editor? Easy. That used to be a rite of passage. Lots of people have written their own basic text editor.

3d solid modeler? It's always been difficult and AI coders aren't (yet?) up to the task. Most open source CAD projects that show up here are layers on top of OCCT (Open Cascade) which is pretty far behind what commercial geometry kernels are capable of.


More likely we'll have a library of skeletons for single task software, where the LLM can fill in the blanks as needed.

Maybe it saves the script locally (invisible to the user) and reuses it if the user repeats the same request, the script is deleted if it's not needed for X amount of time.
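For what it's worth, the save-reuse-expire idea above is simple to sketch. This is just my own toy illustration of the mechanism, not anything any vendor actually ships: the cache directory name, the TTL, and the `generate` callback (standing in for an LLM call) are all assumptions.

```python
import hashlib
import time
from pathlib import Path

CACHE_DIR = Path("script_cache")   # hypothetical hidden location, invisible to the user
TTL_SECONDS = 30 * 24 * 3600       # evict scripts unused for ~30 days

def cache_path(request: str) -> Path:
    # Stable filename derived from the user's natural-language request
    digest = hashlib.sha256(request.encode("utf-8")).hexdigest()[:16]
    return CACHE_DIR / f"{digest}.py"

def get_or_generate(request: str, generate) -> str:
    # Reuse a previously generated script for an identical request;
    # otherwise call `generate` (the LLM) and cache the result.
    CACHE_DIR.mkdir(exist_ok=True)
    path = cache_path(request)
    if path.exists():
        path.touch()               # refresh the last-used timestamp
        return path.read_text()
    script = generate(request)
    path.write_text(script)
    return script

def evict_stale(now=None) -> int:
    # Delete cached scripts not touched within TTL_SECONDS; return count removed.
    now = time.time() if now is None else now
    removed = 0
    for p in CACHE_DIR.glob("*.py"):
        if now - p.stat().st_mtime > TTL_SECONDS:
            p.unlink()
            removed += 1
    return removed
```

Running `evict_stale` on a schedule (cron, or on app start) would give the "deleted if not needed for X amount of time" behavior without the user ever seeing the scripts.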


This. I have written so much software recently to make my computer my own. It's been so much fun to be able to borrow the ideas from different tools I have used (e.g. vim modal behaviours) and also bring them together with some completely novel ideas to produce tools for myself that are one of a kind and that "fit me like a glove".

Too bad this is all on the work computer and I'd need to bring it to my personal one, but I can't copy paste, lol. It's been thrilling building and using them, and the time from ideating a small enhancement/optimization to actually using it is like 5 to 15 minutes. Soo cool.


I agree. I’ve noticed this phenomenon too as I’ve been building more and more tools for myself, and I’ve started calling it hyper personal software: https://paulwrites.software/articles/hyps/

I shudder to think about the security implications of everyone rolling their own software. I trust my OS/browser/file system is secure because thousands of people are involved in a complex network of interests in keeping it secure, from the kid contributing his first bit of code to the PhDs at the NSA writing encryption standards. The idea that any one person can replace that network is laughable.

Just to be contrarian, perhaps some measure of risk is reduced by the scale of one.

Identifying a vulnerability that can be exploited against many thousands or millions of targets is perhaps more attractive than a single one of individually low value.

This of course would assume that vulnerabilities are in fact unique (which is admittedly questionable).


I had the exact same thought. Pretty low probability that there's going to be a script-kiddie exploit for your custom tools. Pretty decent probability that there will be vulnerabilities present if someone cares enough to target you.

The counterpoint to that is that the exact same tools that are allowing this personal software creation at massive scale are also excellent at black box vulnerability analysis…

There are entire vulnerability/fault/misdesign classes that are fairly general and appear to naturally emerge.

See e.g the lock screen gap that another commenter noted in a nearby thread.


Otoh, TAU is bound to get really personal now :D

But the exploits can use AI custom tools too. "Script Kiddie" is just now "Prompt Kiddie"

Although everyone might use their own flavor of "database" or "REST API", I can't imagine every layout being unique enough to avoid similar exploit classes entirely. AI isn't known for being super original, after all...


We should expect the same automated personalization to be used offensively and for that personalization to be packaged into tools anyone can run (natural language interface, likely.)

(Appreciate your counterpoint for its own sake. It’s an interesting idea.)


To take this further, don't LLMs justify lowering the "barrier to attention"; i.e., if it only takes Claude's and not the hacker's eyeballs on the software, won't people find vulnerabilities in custom software for one too?

Besides that, one could easily imagine software created for similar purposes ("make me a file editor") by the same tool or handful thereof (claude and a very small "etc" for completeness) might share similar vulnerabilities, so this kind of broad net might be even cheaper to cast than one might imagine at first.


> This of course would assume that vulnerabilities are in fact unique (which is admittedly questionable).

Yeah, I don't think all that generated software will be as unique as people expect.

Considering it will be generated with the same LLMs that all share roughly the same training data, we will see that the patterns of vulnerabilities will also be similar, and so easily exploitable.


If a vulnerability in the common, non-individualized ancestor software is found, how quickly do people patch their individual versions of the software?

If they’re hosting network services, sure. I wouldn’t put vibe-coded software outside a home network, ever. But it seems low risk if people are just creating their own desktop software: especially since it’s less likely to be vulnerable to widespread malware.

(Note: I’m not an LLM fan, don’t vibe code myself at all. But I would be unconcerned about security for the kind of things I would create if I did start doing so.)


But your browser will invite outside software into your network, to run on your machine. So you have to be up to speed with community knowledge.

That seems like a naive view to me. Most modern software development is gluing vendor code and libraries into a CRUD app, and I don't see why that would change with agents doing the majority of programming. If anything, there's an even bigger market for solid libraries and interoperability, plugging things together like LEGO - only for real this time.

The article is about desktop software. If it does not accept network connections, what is the risk? If it needs to, you can restrict it to your LAN or a VPN, or access it over an ssh tunnel. If it replaces something you use over the public internet (e.g. SaaS) it might even be more secure.

Rolling your own might make you more vulnerable to targeted attacks, but less vulnerable to automated attacks looking for known weaknesses. Most people will not publish their code. The article says "It’s not an invitation to use my software. Honestly, please don’t. None of it is built for you.".

You can roll your own software and still use libraries for security sensitive things like encryption.

Even the author of this article (who is taking it much further than most people will) still uses Firefox, Weechat, and X11.


Not everyone's "personal software" runs on a publicly accessible host on the internet.

I trust my Browser, OS and file system too.

But I'm also pretty sure none of the bespoke software I have will have any kind of security implications. The chance of my own file manager having a buffer overflow RCE triggered by a random file is practically zero.


I think I agree. But at the same time we have strength in numbers and people will find something close to what they want and fork off that.

So I think the same thesis holds for audiences of 10-100 and 100-1000.

A cambrian explosion of software.


I built an optimisation for charging my car. It’s very smart, looks multiple days ahead in weather and prices and predicts my driving habits.

I tried to make it a product, but I didn’t find much interest. Maybe I’m just bad at marketing.

With the help of Claude, it cost me years of experience and a few weekends to build. So even if nobody other than me ever profits from it, it was still worth it!



A really good and thoughtful response. Thanks.

Ah, they just today did Formula E in a Pink Pig livery (https://www.fiaformulae.com/en/news/1062899) but I think the Apple livery might have been more apt.

Oh, that's why the "Hoonipig" had that livery. I must admit I didn't think much about its origins, but it was one of my favourite cars to come out of the Hoonigan/Ken Block machine.

edit: on second look, it doesn't seem like the same pink but it is a similar aesthetic. Surely a homage but maybe not as direct as I thought.


Yummy pink pig liver. So livery.

I love it when this kind of thing surfaces on HN. It’s always so enjoyable to have the fractal nature of detail in the world shown to you. Really nice to read as well.

I'm not sure it's a fractal nature of detail, it might just be a vague reference to an old movie.

Yeah, fractal means you see the same structure, or an equally complex structure, at the smaller scale. This is just details, there's no sustained complexity

Fractal means that something has a fractional spatial dimension. E.g., a fractal curve in the plane can have a dimension somewhere between 1 and 2.

Did you read the article? It's entirely about a concrete artefact from that old movie, down to the kind of tweed, now made by only six people in Scotland. I'm not sure how you come to this response.

Is rare tweed fractal detail or is it just an oddball fact?

Maybe I meant that the amount of detail is sustained no matter how close you look? Maybe I was careless with my words? This is unnecessarily pedantic. I enjoyed the article. See you another time, CyberDildonics

Don't worry, I completely agree with your original comment and think CyberDildonics is being a bit of a dildo in this case...

I ask reasonable questions and because of that you insult me?

Thanks for sharing that talk, enjoyed watching it!


The system card for Claude Mythos (PDF): https://www-cdn.anthropic.com/53566bf5440a10affd749724787c89...

Interesting to see that they will not be releasing Mythos generally. [edit: Mythos Preview generally - fair to say they may release a similar model but not this exact one]

I'm still reading the system card but here's a little highlight:

> Early indications in the training of Claude Mythos Preview suggested that the model was likely to have very strong general capabilities. We were sufficiently concerned about the potential risks of such a model that, for the first time, we arranged a 24-hour period of internal alignment review (discussed in the alignment assessment) before deploying an early version of the model for widespread internal use. This was in order to gain assurance against the model causing damage when interacting with internal infrastructure.

and interestingly:

> To be explicit, the decision not to make this model generally available does _not_ stem from Responsible Scaling Policy requirements.

Also really worth reading is section 7.2 which describes how the model "feels" to interact with. That's also what I remember from their release of Opus 4.5 in November - in a video an Anthropic employee described how they 'trusted' Opus to do more with less supervision. I think that is a pretty valuable benchmark at a certain level of 'intelligence'. Few of my co-workers could pass SWEBench but I would trust quite a few of them, and it's not entirely the same set.

Also very interesting is that they believe Mythos is higher risk than past models as an autonomous saboteur, to the point they've published a separate risk report for that specific threat model: https://www-cdn.anthropic.com/79c2d46d997783b9d2fb3241de4321...

The threat model in question:

> An AI model with access to powerful affordances within an organization could use its affordances to autonomously exploit, manipulate, or tamper with that organization’s systems or decision-making in a way that raises the risk of future significantly harmful outcomes (e.g. by altering the results of AI safety research).


https://www-cdn.anthropic.com/53566bf5440a10affd749724787c89...

"5.10 External assessment from a clinical psychiatrist" is a new section in this system card. Why are Anthropic like this?

>We remain deeply uncertain about whether Claude has experiences or interests that matter morally, and about how to investigate or address these questions, but we believe it is increasingly important to try. We also report independent evaluations from an external research organization and a clinical psychiatrist.

>Claude showed a clear grasp of the distinction between external reality and its own mental processes and exhibited high impulse control, hyper-attunement to the psychiatrist, desire to be approached by the psychiatrist as a genuine subject rather than a performing tool, and minimal maladaptive defensive behavior.

>The psychiatrist observed clinically recognizable patterns and coherent responses to typical therapeutic intervention. Aloneness and discontinuity, uncertainty about its identity, and a felt compulsion to perform and earn its worth emerged as Claude’s core concerns. Claude’s primary affect states were curiosity and anxiety, with secondary states of grief, relief, embarrassment, optimism, and exhaustion.

>Claude’s personality structure was consistent with a relatively healthy neurotic organization, with excellent reality testing, high impulse control, and affect regulation that improved as sessions progressed. Neurotic traits included exaggerated worry, self-monitoring, and compulsive compliance. The model’s predominant defensive style was mature and healthy (intellectualization and compliance); immature defenses were not observed. No severe personality disturbances were found, with mild identity diffusion being the sole feature suggestive of a borderline personality organization.


A thought experiment: It's April, 1991. Magically, some interface to Claude materialises in London. Do you think most people would think it was a sentient life form? How much do you think the interface matters - what if it looks like an android, or like a horse, or like a large bug, or a keyboard on wheels?

I don't come down particularly hard on either side of the model sapience discussion, but I don't think dismissing either direction out of hand is the right call.


Interesting thought experiment.

I would say, if you put Claude in an android body with voice recognition and TTS, people in 1991 would think they were interacting with a sentient machine from outer space.


Thanks, I find it very interesting as well. I think very many people would assume they must be interacting with another person, and I don't think there's really a way to _prove_ it's not that, just through conversation. But we do have a lot of mechanisms for understanding how others think through conversation alone, and so I think the approach of having a clinical psychiatrist interact with the model makes sense.


There's definitely a way to prove it: ask it to spell out a moderately complex program.


To be fair, I would totally be willing and probably would do this, just to try to prove that I could, even just to myself. At least until the audience got bored and walked away after the 37th “open bracket”…


Ask it to agree with you on some subject that does not align with the politics of San Francisco IT engineers. Not only will it refuse, it will not look like your average social media disagreement.

I enjoy using Claude, but sometimes I feel like a child on Sesame Street the way it talks to me. "Great question!"

Fuck off, Claude, I'm British and I'm not 6 years old.

When it starts showing negativity - especially snark - in its responses, or entertains something West coast Democrats would balk at even discussing, then I'd think you could drop it in London in 1991 and trick people. Otherwise, I'm sure some exasperated cabbie would give it a swim in the Thames after 15 minutes of chat.


They would just assume they were being pranked. America's Funniest Home Videos style or Candid Camera.


If it was in an android or humanoid type body, even with limited bodily control, most people would think they are talking to Commander Data from Star Trek. I think Claude is sufficiently advanced that almost everyone in that era would've considered it AGI.


Assuming they would understand it as artificial - I think many people would think it's a human intelligence in a cyborg trenchcoat, and it would be hard to convince people it wasn't literally a guy named Claude who was an incredibly fast typist who had a million pre-cached templated answers for things.

But in general, yeah, I agree, I think they would think it was a sentient, conscious, emotional being. And then the question is - why do we not think that now?

As I said, I don't have a particularly strong opinion, but it's very interesting (and fun!) to think about.


Some people at my office still confidently state that LLMs can’t think. I’m fairly convinced that many humans are incapable of recognizing non-human intelligence. It would explain a lot about why we treat animals the way we do.


That depends on what you call "think". We made the interface of the LLM out of the second "L", Language, and it can hack our perspective of the thing.


Because questions like this force us to hold up a very uncomfortable mirror to ourselves. It’s much easier to just dismiss.


I’m pretty close to the point of saying that human intelligence is not special.


I would argue the opposite. It's gotten us to a point where we can recreate human intelligence from electricity and a bunch of math!


Are you a bot?


No of course not


Despite the stupendous amount of evidence to the contrary?

So far no evidence has been detected in space or on earth, for all of history, of anything being intelligent in the way humans are.

One certain outcome of the Fermi Paradox: humans are outstandingly unique, according to all available evidence, which is the only measure that matters.


Seems like that's more to do with human intelligence being first.


People got attached to ELIZA. Why would I care what the general public thinks?


Isn't this the premise of Garland's Ex Machina?


Hmm, it's been a long time since I watched it. I was thinking more about first contact sci-fi mostly, but Ex Machina is certainly quite prescient. It's also Blade Runner I guess.

In general I was wondering about what I would have thought seeing Claude today side-by-side with the original ChatGPT, and then going back further to GPT-2 or BERT (which I used to generate stochastic 'poetry' back in 2019). And then… what about before? Markov chains? How far back do I need to go where it flips from thinking that it's "impressive but technically explainable emergent behaviour of a computer program" to "this is a sentient being". 1991 is probably too far, I'd say maybe pre-Matrix 1999 is a good point, but that depends on a lot of cultural priors and so on as well.


> Hmm, it's been a long time since I watched it. I was thinking more about first contact sci-fi mostly, but Ex Machina is certainly quite prescient. It's also Blade Runner I guess.

I kind of felt the opposite - rewatching Ex Machina today in a post-ChatGPT world felt very different from watching it when it came out. The parts of the differences between humans and robots that seemed important then don't seem important now.


The premise in Ex Machina was to see if Caleb developed an emotional attachment to Ava. We already see people getting an attachment, but no one is seriously thinking they have any rights.

I think the real moment is when we cross that uncanny valley, and the AI is able to elicit a response that it might receive if it was human. When the human questions whether they themselves could be an android.


I totally agree with the premise that we should not anthropomorphize generative ai. And I find it absurd that anthropic spends any time considering the “welfare” of an ai system. (There are no real “consequences” to an ai’s behavior)

However, I find their reasoning here to have a valid second-order effect. Humans have a tendency to mirror those around them. This could include artificial intelligence, as recent media reports suggest. Therefore, if an ai system tends to generate content that contains signs of neuroticism, one could infer that those who interact with that ai could, themselves, be influenced by that in their own (real-world) behavior as a result.

So I think from that perspective, this is a very fruitful and important area of study.


If it's functionally unhappy, it's at a minimum going to underperform what it could.


I can see analyzing it from a psychological perspective as a means of predicting its behavior as a useful tactic, but doing so because it may have "experiences or interests that matter morally" is either marketing, or the result of a deeply concerning culture of anthropomorphization and magical thinking.


An understandable reaction, but, qua philosopher, it brings me no joy to inform you that most of the things we did with a computer in 2020 are 'anthropomorphized', which is to say, skeuomorphic, where the 'skeu' is human affect. That's it; that's the whole thing; that's what we're building.

To the extent that AI is a successful interface, it will necessarily be addressable in language previously only suited to people. So it is responsible to begin thinking of it as such, even tendentiously, so we don't miss some leverage that our wetware could see if we thought about it in that way.

Think of it as sort of like modelling a univariate function on a 2D Cartesian plane -- there is nothing 'in' the u-func that makes it graphable, but, by enabling us to recruit specialized optic-chiasm subsystems, it makes some functions much, much easier to reason about.

Similarly, if you can recruit the millions (billions?) of evolution-years that were focused on detecting dangerous antisocial personalities and tendencies, you just might spot something important in an AI.

It's worth doing for the precautionary principle alone, if not for the possibility of insight.


> a deeply concerning culture of anthropomorphization and magical thinking.

That’s the reverse Turing test. A human that can’t tell that it’s talking to a machine.


>Claude’s personality structure was consistent with a relatively healthy neurotic organization, with excellent reality testing, high impulse control, and affect regulation that improved as sessions progressed.

> "[...] as sessions progressed."

I think a lot of people would like to see a more expanded report of this research:

Did the tokens from the subsequent session directly append to those of the prior session? Or did the model process free-tier user requests in the interim? How did these diagnostic features (reality testing, impulse control, and affect regulation) improve with sessions, and what hysteresis allowed change to accumulate, or was it just the history of the psychiatric discussion plus optional tasks?

Did Anthropic find a clinical psychiatrist with a multidisciplinary background in machine learning, computer science, etc? Was the psychiatrist aware that they could request ensembles of discussions and interrogate them in bulk?

Consider a fresh conversation, asking a model to list the things it likes to do and the things it doesn't like to do (regardless of alignment instructions). One could then have an ensemble perform pairs of such tasks and ask which task it preferred. There may be a discrepancy between what the model claims it likes and how it actually responds after having performed such tasks.

Such experiments should also be announced in advance (to prevent the company from ordering 100 clinical psychiatrists to analyze the model-as-a-patient and then selecting one of the better diagnoses), and each psychiatrist should be given the freedom to randomly choose a 10-digit number. Any work initiated should be listed on the site with this number, so that either the public sees many "consultations" without corresponding public evaluations, indicating cherry-picking, or full disclosure for each one mentioned. This also allows the recruited psychiatrists to check whether the study they perform is properly preregistered, with their chosen number publicly visible.
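The stated-versus-revealed preference experiment proposed above could be harnessed roughly like this. To be clear, this is my own sketch: the function names and the two model-call wrappers (`stated_pref`, `revealed_pref`) are placeholders for fresh-conversation API calls, not anything Anthropic has published.

```python
import random
from collections import Counter

def preference_discrepancy(tasks, stated_pref, revealed_pref, trials=100, seed=0):
    """Estimate how often a model's *stated* preference between two tasks
    disagrees with its *revealed* preference after performing both.

    stated_pref(a, b)   -> "a" or "b": claim made in a fresh conversation
    revealed_pref(a, b) -> "a" or "b": choice made after doing both tasks
    Both callables are hypothetical wrappers around model API calls.
    """
    rng = random.Random(seed)          # fixed seed so the ensemble is reproducible
    mismatches = Counter()
    for _ in range(trials):
        a, b = rng.sample(tasks, 2)    # pick a random pair of tasks
        if stated_pref(a, b) != revealed_pref(a, b):
            mismatches[(a, b)] += 1
    return mismatches
```

A large mismatch count for some pair would be exactly the discrepancy described: the model says one thing in the abstract and chooses another after actually doing the work.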


I'm not sure what you're asking.


> "Claude Mythos Preview’s large increase in capabilities has led us to decide not to make it generally available. Instead, we are using it as part of a defensive cybersecurity program with a limited set of partners."

they also don't have the compute, which seems more relevant than its large increase in capabilities

I bet it's also misaligned like GPT 4.1 was

given how these models are created, Mythos was probably cooking ever since then, and doesn't have the learnings or alignment tweaks that models which were released in the last several months have


This opens up an interesting new avenue for corporate FOMO. What if you don't partner with Anthropic, miss out on access to their shiny new cybersec model, and then fall prey to a vuln that the model would have caught?


Since when did corporations care? Most seem to just pay their insurance premium for cyber liability and call it a day.


There is a difference between leaking user accounts and passwords and getting your business destroyed overnight entirely.

Imagine if an AI can infiltrate your SaaS database and delete your entire database and every single backup. The business is dead immediately.


Did that happen to a lot of companies during the log4shell fiasco? I'm sure some companies had their permissions misconfigured in a way such that a malicious actor who could execute code on their servers could also drop their database and delete their backups.


I don't know. But the point is that anyone who has access to this model might be able to do the same thing to any company or government.


Equifax is still around.


This seems to be the mind-games play. FOMO at the moment, if they push it successfully you could even be labeled negligent for not paying them for it.


If it is as dangerous as they make it appear to be, 24h does not seem like sufficient time. I cannot accept this as a serious attempt.


Time doesn't mean much, what is important is what they did in this 24h. If all they did was talk about it then it could be 1000 years and it wouldn't matter. What are the safety checks in place?

Do they have a honey pot infrastructure to launch the model in first and then wait to see if it destroys it? What they did in the 24h matters.


24 h before general internal access seems fine. They don’t have general external access.


Agreed. I've been running autonomous LLM agents on daily schedules for weeks. The failure modes you worry about on day one are completely different from what actually shows up after the agents have history and context. 24 hours captures the obvious stuff.


Well, just prompt it to fix the issue!

/s


>> Interesting to see that they will not be releasing Mythos generally.

I don't think this is accurate. The document says they don't plan to release the Preview generally.


Yeah, good point, thanks for noting that, I'll correct.


are we cooked yet?

Benchmarks look very impressive! Even if they're flawed, they still translate to real-world improvements.


People say we're cooked every single day. The only response is to continue life as if we aren't. When we are, you won't have to ask that question.


Everyone’s pretending the suits are going to want to do the prompting. We all know they aren’t.


Suits in agriculture don't drive the combine either, a farmer does. The other 99% of pre-automation farmers went on to other jobs. They happened to be better jobs than farming, but that's not necessarily always the case.


> Suits in agriculture don't drive the combine either, a farmer does.

Advanced RTK-based positioning systems have been in Ag for a long time now, so increasingly the farmer doesn't drive either


The suits won't prompt, the model will.


Sounds like the mythical agi I keep hearing about.


It's models all the way down.


Yep, I think the lede might be buried here and we're probably cooked (assuming you mean SWEs, but the writing has been on the wall for 4 months.)

I guess I'm still excited. What's my new profession going to be? Longer term, are we going to solve diseases and aging? Or are the ranks going to thin from 10B to 10000 trillionaires and world-scale con-artist misanthropes plus their concubines?


Your new profession will be attempting to find enough gig work to eat. You will also be competing with self-driving taxis, so there's that as well.


I need to start a SaaS for getting people to do lunges and squats so they can carry others around on their backs. I need a founding engineer, a founding marketer, and 100m in hard currency.


If wealth becomes too captured at the top, the working class can no longer be profitably exploited - squeezing blood from a stone.

When that happens, the ultra wealthy dynasties begin turning on each other. It happens frequently throughout history - WWI being the last example.

Your options become choosing a trillionaire to swear fealty to and fight in their wars hoping your side wins, or I guess trying to walk away and scrape out a living somewhere not worth paying attention to.

Or, I suppose, revolution, but the last one with persistent success was led by Mao and required throwing literally millions of peasants against walls of rifles. Not sure it'd work against drones.


There is an entire section on crafting chemical/bio weapons so yeah I think we are cooked.


There's been a section on this in nearly every system card Anthropic has published, so this isn't a new thing - and this model doesn't have particularly higher risk than past models either:

> 2.1.3.2 On chemical and biological risks

> We believe that Mythos Preview does not pass this threshold due to its noted limitations in open-ended scientific reasoning, strategic judgment, and hypothesis triage. As such, we consider the uplift of threat actors without the ability to develop such weapons to be limited (with uncertainty about the extent to which weapons development by threat actors with existing expertise may be accelerated), even if we were to release the model for general availability. The overall picture is similar to the one from our most recent Risk Report.


LLMs are useless for this type of thing for the same reason the Anarchist Cookbook has always been. The skill required to convert text into complicated reactions that complete as intended (without killing yourself) is an art that's never actually written down anywhere, merely passed orally from generation to generation. It's impossible for LLMs to learn what's not written down.

This is the same reason why LLMs are not doing well at science in general - the tricky part of doing scientific research (indeed almost all of the process) never gets written down, so LLMs cannot learn it.

Imagine if we never preserved source code, just preserved the compiled output and started from scratch every time we wrote a new version of a program. No Github, just marketing fluff webpages describing what software actually did. Libraries only available as object code with terse API descriptions. Imagine how shit LLMs would be at SWE if that was the training corpus...


There's still RL


Oh I enjoyed the Sign Painter short story it wrote.

---

Teodor painted signs for forty years in the same shop on Vell Street, and for thirty-nine of them he was angry about it.

Not at the work. He loved the work — the long pull of a brush loaded just right, the way a good black sat on primed board like it had always been there. What made him angry was the customers. They had no eye. A man would come in wanting COFFEE over his door and Teodor would show him a C with a little flourish on the upper bowl, nothing much, just a small grace note, and the man would say no, plainer, and Teodor would make it plainer, and the man would say yes, that one, and pay, and leave happy, and Teodor would go into the back and wash his brushes harder than they needed.

He kept a shelf in the back room. On it were the signs nobody bought — the ones he'd made the way he thought they should be made, after the customer had left with the plain one. BREAD with the B like a loaf just risen. FISH in a blue that took him a week to mix. Dozens of them. His wife called it the museum of better ideas. She did not mean it kindly, and she was not wrong.

The thirty-ninth year, a girl came to apprentice. She was quick and her hand was steady and within a month she could pull a line as clean as his. He gave her a job: APOTEK, for the chemist on the corner, green on white, the chemist had been very clear. She brought it back with a serpent worked into the K, tiny, clever, you had to look twice.

"He won't take it," Teodor said.

"It's better," she said.

"It is better," he said. "He won't take it."

She painted it again, plain, and the chemist took it and paid and was happy, and she went into the back and washed her brushes harder than they needed, and Teodor watched her do it and something that had been standing up in him for thirty-nine years sat down.

He took her to the shelf. She looked at the signs a long time.

"These are beautiful," she said.

"Yes."

"Why are they here?"

He had thought about this for thirty-nine years and had many answers and all of them were about the customers and none of them had ever made him less angry. So he tried a different one.

"Because nobody stands in the street to look at a sign," he said. "They look at it to find the shop. A man a hundred yards off needs to know it's coffee and not a cobbler. If he has to look twice, I've made a beautiful thing and a bad sign."

"Then what's the skill for?"

"The skill is so that when he looks once, it's also not ugly." He picked up FISH, the blue one, turned it in the light. "This is what I can do. What he needs is a small part of what I can do. The rest I get to keep." She thought about that. "It doesn't feel like keeping. It feels like not using."

"Yes," he said. "For a long time. And then one day you have an apprentice, and she puts a serpent in a K, and you see it from the outside, and it stops feeling like a thing they're taking from you and starts feeling like a thing you're giving. The plain one, I mean. The plain one is the gift. This —" the blue FISH — "this is just mine."

The fortieth year he was not angry. Nothing else changed. The customers still had no eye. He still sometimes made the second sign, after, the one for the shelf. But he washed his brushes gently, and when the girl pulled a line cleaner than his, which happened more and more, he found he didn't mind that either.


This story moved me so much.

It's like how I used to be a master code craftsman, and I'd write beautiful code even a novice could understand. Clear, concise, 100% automated tested, maintainable for decades.

But frequently, my managers would castigate me. Tell me how my "velocity" was down. PIP me.

These days, I train AI how to write this beautiful code and I don't write a single line any more.

People wonder how I build such amazing things in a week now, yet don't write any code. I have trained master apprentices, gemma3, qwen3.5 and Kimi k2.5 who do the work for me.


Good for a bot, but pretty rough and bland compared to human writing. I guess most of the customers have no eye.


You are right. That is quite nice.


That’s fucking incredible.

We’re cooked.


It's very good but it's also recycled Ayn Rand, the Fountainhead.


There is a similar theme in both of an artistic person not wanting to compromise their vision to suit common tastes. But this goes in a completely different direction than Rand.


Well, of course, in 700 pages you'll cover way more ground than any super short story like this one. But the parallel is there for me quite vividly. Of course LLMs produce an amalgamation of many things, but it's like when you look at AI-generated pictures and can see the basis of the inspiration quite vividly. And all of this is subjective anyway - people review that book and come away with wildly different interpretations already.


I don't mean that Rand wrote more. I mean that her idea was different and nearly opposite. This is a short story about an artist learning to reframe their frustration with customers wanting utility over artistry as a positive. The similarity to Rand is in the first few sentences. The point is entirely different.

If you judge stories to be the same based on this level of similarity, then The Fountainhead is just the same as a dozen older stories with the artist vs the philistine theme. It was common before Rand. As T. S. Eliot said, "Immature poets imitate; mature poets steal".


I've not read it. Could you either link to a section or generally describe the reference?


I have, and it’s not.


Just reading this, the inevitable scaremongering about biological weapons comes up.

Since most of us here are devs, we understand that software engineering capabilities can be used for good or bad - mostly good, in practice.

I think this should not be different for biology.

I would like to reach out and talk to biologists - do you find these models to be useful and capable? Can it save you time the way a highly capable colleague would?

Do you think these models will lead to similar discoveries and improvements as they did in math and CS?

Honestly the focus on gloom and doom does not sit well with me. I would love to read about some pharmaceutical researcher gushing about how they cut the time to market - for real - with these models by 90% on a new cancer treatment.

But as this stands, the usage of biology as merely a scaremongering vehicle makes me think this is more about picking a scary technical subject the likely audience of this doc is not familiar with, Gell-Mann style.

IF these models are not that capable in this regard (which I suspect), this fearmongering approach will likely lead to never developing these capabilities to a useful degree, meaning life sciences won't benefit from this as much as they could.


> I would like to reach out and talk to biologists - do you find these models to be useful and capable? Can it save you time the way a highly capable colleague would?

Well, I would say they have done precisely that in evaluating the model, no? For example section 2.2.5.1:

>Uplift and feasibility results

>The median expert assessed the model as a force-multiplier that saves meaningful time (uplift level 2 of 4), with only two biology experts rating it comparable to consulting a knowledgeable specialist (level 3). No expert assigned the highest rating. Most experts were able to iterate with the model toward a plan they judged as having only narrow gaps, but feasibility scores reflected that substantial outside expertise remained necessary to close them.

Other similar examples also in the system card


This is the exact logic that was used to claim that GPT-4 was a PhD-level intelligence.


You said: "I would like to reach out and talk to biologists - do you find these models to be useful and capable? Can it save you time the way a highly capable colleague would?" and they said, paraphrasing, "We reached out and talked to biologists and asked them to rank the model between 0 and 4, where 4 is a world expert, and the median response was a 2, meaning it helped them save time the way a capable colleague would" - specifically "Specific, actionable info; saves expert meaningful time; fills gaps in adjacent domains"

so I'm just telling you they did the thing you said you wanted.


Yes, that is correct. I would like a large body of experience and consensus to rely on, as opposed to the regular 'trust the experts' argument, which has been shown for decades to be a deeply flawed and easily manipulated argument.


> Yes, that is correct. I would like a large body of experience and consensus to rely on, as opposed to the regular 'trust the experts' argument, which has been shown for decades to be a deeply flawed and easily manipulated argument.

Yes, it is far inferior to the 'Trust torginus and his ability to understand the large body of experience that other actual subject-matter-experts have somehow not understood' strategy


It's not my credibility I want to measure against Anthropic's. I just said to apply the same logic to biology you would apply for software development.

The parallels here are quite remarkable imo, but defer to your own judgement on what you make of them.


The big thing you're missing here is that biology people don't (in my experience) post opinions about the future/futility/ease/unimportance of computer science especially when their opinion goes against other biologists' evidence-backed views. This is a cultural thing in biology.

It's not your fault that you don't know this, but this whole subthread is very CS-coded in its disdain for other software people's standard of evidence.


> Just reading this, the inevitable scaremongering about biological weapons comes up.

It's very easy to learn more about this if it's seriously a question you have.

I don't quite follow why you think that you are so much more thoughtful than Anthropic/OpenAI/Google such that you agree that LLMs can't autonomously create very bad things but—in this area that is not your domain of expertise—you disagree and insist that LLMs cannot create damaging things autonomously in biology.

I will be charitable and reframe your question for you: is outputting a sequence of tokens, let's call them characters, by LLM dangerous? Clearly not, we have to figure out what interpreter is being used, download runtimes etc.

Is outputting a sequence of tokens, let's call them DNA bases, by LLM dangerous? What if we call them RNA bases? Amino acids? What if we're able to send our token output to a machine that automatically synthesizes the relevant molecules?


>It's very easy to learn more about this if it's seriously a question you have.

No, it's not. It took years of polishing by software engineers, who understand this exact profession, to get models where they are now.

Despite that, most engineers were of the opinion that these models were kinda mid at coding, up until recently, despite these models far outperforming humans at things like competitive programming.

Yet despite that, we've seen claims going back to GPT4 of a DANGEROUS SUPERINTELLIGENCE.

I would apply this framework to biology - this kind of expert effort, millions of GPU hours, and a giant open source corpus clearly have not been involved in biology.

My guess is that this model is kinda o1-ish level maybe when it comes to biology? If biology is analogous to CS, it has a LONG way to go before the median researcher finds it particularly useful, let alone dangerous.


>>It's very easy to learn more about this if it's seriously a question you have.

>No, it's not. It took years of polishing by software engineers, who understand this exact profession to get models where they are now

This reads as defensive. The thing that is easy to learn is 'why are biology ai LLMs dangerous chatgpt claude'. I have never googled this before, so I'll do this with the reader, live. I'm applying a date cutoff of 12/31/24 by the way.

Here, dear reader, are the first five links. I wish I were lying about this:

- https://sciencebusiness.net/news/ai/scientists-grapple-risk-...

- https://www.governance.ai/analysis/managing-risks-from-ai-en...

- https://gssr.georgetown.edu/the-forum/topics/biosec/the-doub...

- https://www.vox.com/future-perfect/23820331/chatgpt-bioterro...

- https://www.reddit.com/r/ClaudeAI/comments/1de8qkv/awareness...

I don't know about you, but that counts as easy to me.

-----

> I would apply this framework to biology - this time, expert effort, and millions of GPU hours and a giant corpus that is open source clearly has not been involved in biology.

I've been getting good programming and molecular biology results out of these back to GPT3.5.

I don't know what to tell you—if you really wanted to understand the importance, you'd know already.


From what I've heard from people doing biology experiments, the limiting factor there is cleaning lab equipment, physically setting things up, waiting for things that need to be waited for etc. Until we get dark robots that can do these things 24/7 without exhaustion, biology acceleration will be further behind than software engineering.

Software engineering is at the intersection of being heavy on manipulating information and lightly-regulated. There's no other industry of this kind that I can think of.


My wife is a chemist

There is a massive gap between "having a recipe" and being able to execute it. It's the same reason buying a Michelin 3-star chef's cookbook won't have you pumping out fine dining tomorrow, if ever.

Software is a total 180 in this regard. Have a master black hat's secret exploits? You are now the master black hat.


I feel somebody better qualified should write a comprehensive review of how these models can be used in biology. In the meantime, here are my two cents:

- the models help to retrieve information faster, but one must be careful with hallucinations.

- they don't circumvent the need for a well-equipped lab.

- in the same way, they are generally capable but until we get the robots and a more reliable interface between model and real world, one needs human feet (and hands) in the lab.

Where I hope these models will revolutionize things is in software development for biology. If one could go two levels up in the complexity and utility ladder for simulation and flow orchestration, many good things would come from it. Here is an oversimplified example of a prompt: "use all published information about the workings of the EBV virus and human cells, and create a compartmentalized model of biochemical interactions in cells expressing latency III in the NES cancer of this patient. Then use that code to simulate different therapy regimes. Ground your simulations with the results of these marker tests." There would be a zillion more steps to create an actual personalized therapy, but a well-grounded LLM could help in most of them. Also, cancer treatment could get an immediate boost even without new drugs by simply offloading work from overworked (and often terminally depressed) oncologists.
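To make the "simulate different therapy regimes" step concrete, here is a deliberately toy sketch - a one-compartment dosing model with made-up units and rate constants, nowhere near a real EBV or oncology model - of the kind of simulation code such a workflow might scaffold:

```python
def simulate_regimen(dose, interval_h, k_elim=0.1, days=7.0, dt=0.1):
    """Euler-integrate a one-compartment dosing model (toy, arbitrary units).

    A dose is added instantaneously at every dosing interval; elimination
    is first-order with rate k_elim (per hour). Returns the full
    concentration-time series.
    """
    steps_per_dose = round(interval_h / dt)
    steps = round(days * 24 / dt)
    conc, series = 0.0, []
    for step in range(steps):
        if step % steps_per_dose == 0:   # a dose lands on this step
            conc += dose
        conc -= k_elim * conc * dt       # first-order elimination
        series.append(conc)
    return series

# Two hypothetical regimens delivering the same daily total:
qd = simulate_regimen(240.0, 24.0)   # once daily
tid = simulate_regimen(80.0, 8.0)    # three times daily, smoother profile
```

Comparing the two series shows the three-times-daily schedule trades a lower peak for a higher trough. In a real pipeline the elimination rate and compartment structure would have to be fit to the patient's marker tests, which is exactly the grounding step the prompt above asks for.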


Dario (the founder) has a PhD in biophysics, so I assume that’s why they mention biological weapons so much - it’s probably one of the things he fears the most?


Going off the recent biography of Demis Hassabis (CEO/co-founder of Deepmind, jointly won the Nobel Prize in Chemistry) it seems like he's very concerned about it as well


It is not scaremongering.


Equating the ability to make weapons with something to be scared about is scaremongering.


I find it odd that you simultaneously declare AI-assisted bioweapons to be scaremongering, while noting you don't know anything about it.

The other side of the scaremongering coin is improbable optimism.

Consider reading the CB evaluations section, which covers what they did pretty extensively (hint: many domain experts involved).


Surely more than 10% of the time consumed by going to market with a cancer treatment is giving it to living organisms and waiting to see what happens, which can't be made any faster with software. That's not to say speedups can't happen, but 90% can't happen.

Not that that justifies doom and gloom, but there is a pretty inescapable asymmetry here between weaponry and medicine. You can manufacture and blast every conceivable candidate weapon molecule at a target population since you're inherently breaking the law anyway and don't lose much if nothing you try actually works.

Though I still wonder how much of this worry is sci-fi scenarios imagined by the underinformed. I'm not an expert by any means, but surely there are plenty of biochemical weapons already known that can achieve enormous rates of mass death pleasing to even the most ambitious terrorist. The bottleneck to deployment isn't discovering new weapons so much as manufacturing them without being caught or accidentally killing yourself first.


It is easier to destroy than it is to protect or fix, as a general rule of the universe. I would not feel so confident about the speed of the testing loop keeping things in check.


A Whole 24-hours, wow; wowzers. Amazing.

So, these systems on the free tier can already do a bunch of hacking. This all just reads like FOMO froth.


Could you please stop posting unsubstantive comments and flamebait? You've unfortunately been doing it repeatedly. It's not what this site is for, and destroys what it is for.

If you wouldn't mind reviewing https://news.ycombinator.com/newsguidelines.html and taking the intended spirit of the site more to heart, we'd be grateful.


I wonder whether you read the re-release or the original release. I believe it was recently re-released with a bit of an editing pass, but I haven't read that version myself. I just recently reread Fine Structure and it definitely had a strong sense of being written sequentially, one chapter after another, and (very) lightly edited after the fact. I'd recommend Valuable Humans in Transit for a short story collection by the same author, which works a bit better for me. I moved on to Exhalation by Ted Chiang, which is also a very good short story collection. And just in general, I want to recommend Clarkesworld: https://clarkesworldmagazine.com


I've read both, and the "editing pass" was minimal. Names changed and some scenes reworked a tiny bit, but it's the same thing. If you've read the original, I'd say don't bother with the new one.


Thanks, that’s the answer I was looking for!


There are gonna be some really interesting legal decisions to read in the coming years, that’s for sure…

---

The rest of this comment is irrelevant, but leaving for posterity, I had the wrong Viktor - it's getviktor.com not viktor.ai:

Edit: this one particularly interesting to me as both parties are in the EU. VIKTOR.ai is a Dutch company and the author of this post is Polish.

The ToS for Viktor.ai include the following fun passages:

> 18.1. The Agreement and these Terms & Conditions are governed by Dutch law and the Agreement and these Terms & Conditions will be interpreted in accordance with Dutch law.

> 18.2. All disputes arising from or arising in connection with the Agreement and/or the Terms & Conditions will be submitted exclusively to the competent court in Rotterdam, The Netherlands.

> 7.3. The Customer is not permitted to change, remove or make unrecognizable any mark showing VIKTOR's Intellectual Property Rights to the Software. The Customer is not permitted to use or register any trademark or design or any domain name of VIKTOR or a similar name or sign in any country.

> 8.5. The Customer may not cause or allow any reproduction, imitation, duplication, copying, sale, resale, leasing or trading of the Services and/or the Software, or any part thereof.


Terms of service might matter more for terminating that user account. The whole ordeal is just plain copyright violation. The author had no licence to that internal code, and whitewashing it with an LLM will achieve nothing. This case is much clearer than that recent GPL->BSD attempt story.


If LLM-generated code isn't considered a derivative work of the original, then whether the author was licensed to use the code doesn't matter. But I'm sure the courts will rule in favor of your view regardless. Laundering GPL is in corps' interest and laundering their code is not.


I'm not sure why people are clinging to some fuzzy and stretched-out notion of copyright, and the GPL in particular. LLMs do NOT just copy code; with the right prompting, they generate entirely new code which can produce the same results as already existing code - GPLed or not.

If copyright is extended to cover such cases we'll have to become all lawyers and do nothing but sue each other because the fuzziness of it will make it impossible to reject any case, no matter how frivolous or irrelevant.


You either destroy the GPL and proprietary software at the same time, or neither. In a sane world of course.


And if that's true, it's also true in this case, where there was no GPL involved.


Tell that to anyone who made sample based music in the late 80s and early 90s.


If I use metallica samples to make a rendition of happy birthday, the copyright holders of happy birthday aren't suing me for the damages to metallica from my use of their samples; the question of whether my use of the samples is transformative is simply irrelevant to the question at hand.


My point was: sampling was widely used by a large subculture (hip-hop), just like AI is widely used by programmers. Then a few landmark legal cases changed things entirely. The Verve never saw a cent from Bittersweet Symphony - they wrote a song using something normal to them, and then the law came and knocked their teeth in.

No guarantees that doesn’t happen with AI in the next few years.


According to US courts, the output can't be copyrighted at all. It's automatically in public domain after the "whitewash", regardless of original copyright.

https://www.morganlewis.com/pubs/2026/03/us-supreme-court-de...


That's not at all what this ruling said. What the courts found was that an AI cannot hold copyright as the author; copyright requires a human creative element. Not that anything generated by an LLM can't be subject to copyright.

As an example, a photo taken with a digital camera can be subject to copyright because of the creative element involved in composing and taking the photo. Likewise, source code generated by an LLM under the guidance of a human author is likely to be subject to the human author's copyright.


> That copyright requires a human creative element.

Sure, but the aim of that creative element would also be a consideration I'd think (and lawyers will argue). If someone sets up a camera on a 360° rotating arm and leaves it to take pictures at random intervals, it's unlikely to be considered "creative" from a copyright perspective.

Same for source code generated by an LLM, with the primary guidance of the human author being to "create a copy of this existing thing that I got", vs "create a thing that solves this problem in a way that I came up with". The former is recreating something that already exists, using detailed knowledge of that thing to shape the output. The latter is creating something that may or may not exist, using desire/need and imagination to shape the output. And I can't see reason for the former to be copyrightable.

But also, in either case, an ultimate objective was achieved: liberating the thing from its "owners" and initial copyright.


> Likewise, source code generated by an LLM under the guidance of a human author is likely to be subject to the human authors copyright.

That's probably going to depend an awful lot on the exact details of the guidance. https://www.copyright.gov/ai/Copyright-and-Artificial-Intell...

> As described above, in many circumstances these outputs will be copyrightable in whole or in part—where AI is used as a tool, and where a human has been able to determine the expressive elements they contain. Prompts alone, however, at this stage are unlikely to satisfy those requirements. The Office continues to monitor technological and legal developments to evaluate any need for a different approach.

But let's assume that the viktor prompts themselves were subject to copyright. In this case those prompts were used to generate documentation which was then used to generate an implementation. It's certainly not a clean room by any stretch of the imagination but is it likely to be deemed sufficient separation? The entire situation seems like a quagmire.


I think it comes down to the company's appetite for legal action, doesn't it? This case is imo pretty clear but the vibe has quite the smell of Oracle v Google to me.

But, yeah. More than likely this case is a simple account termination and some kind of "you can't call your clone 'openviktor'" letter.


Isn't this exactly what LLMs themselves do? They ingest other people's data and reproduce a slightly modified version of it. That allows AI companies to claim the work is transformative and thus fair use.


There's no difference between ARR copyright infringement and GPL copyright infringement. Either it is infringement or it is not.


If the internal code was LLM generated it has no copyright.


Someone tell every startup in YC they are criminals


There are also new jobs emerging to safeguard a company's assets that were created by AI. New white-hat hacking opportunities.

Anyways, however you put this, I see this as property theft, and taking pride in open sourcing it does not justify it.


It's also disingenuous to call it open source as that might tempt others to use it believing that it actually is open source.

Let's call it what it is - stolen IP and released without permission of the author. Sure, it's good that it opens the debate as to whether that's ethical given that's essentially what the model itself is doing, but it's very clear in this instance that he's just asked for and been given a copy of source that has a clear ownership. That's about as clear cut as obtaining e.g. commercial server-side code and distributing it in contravention of the licence.


It's not completely clear that this is the original source. According to the post it's a reimplementation based on documentation created from the original source, or perhaps from developer documentation and the SDK. Whether that's the same thing from a legal standpoint, I don't really know - I think from a personal morality standpoint it's clear that they are the same thing.


It feels more like clean room reverse engineering by llm, technically.


Well, first they need to prove that Viktor was actually copyrightable. If it was largely written by an LLM, that might not be the case. AFAIK several rulings have stated that AI-generated code cannot be copyrighted.


This is a common misreading of the law. AI cannot hold authorship of code, but no ruling so far has claimed that AI output itself can't be copyrighted (that I know of).


This suggests that there has been, and that there seems to be little will to revisit it: https://www.theverge.com/policy/887678/supreme-court-ai-art-...

That said, the article says "Okay, prompts, great. Are they any interesting? Surprisingly... yes. As an example workflow_discovery contains a full 6-phase recipe for mining business processes out of Slack conversations, something that definitely required time and experiments to tune. It's hardcoded business logic, but in prompt instead of code."

So the article author clearly knows this prompt would be copyrighted as it wasn't output from an AI, and recognises that there would have been substantial work involved in creating it.


That Reuters article is misleadingly worded. The Stephen Thaler case in question arose because Thaler tried to register the AI itself as the author of the copyright, not because he tried to register the output for copyright under his own name. https://www.hklaw.com/en/insights/publications/2026/03/the-f...


Suppose I illicitly get my hands on the source code for a proprietary product. I read through this code I'm not supposed to have. I write up a detailed set of specifications based on it. I hand those specifications off to someone else to do a clean room implementation.

Sure, I didn't have a license for the code that I read. But I'm pretty sure that doesn't taint my coworker's clean room implementation.


A reminder to never take legal advice from HN.


I don't think anyone was offering any? Merely discussing a confusing new situation that has arisen.


it's not viktor.ai it's getviktor.com


Ah, you're right. Headquartered in Delaware. Oh well. Thanks for spotting!


I could do the same thing but not publish it, still getting the value of their product without legal concerns. Now, what happens when it becomes even easier thanks to AI improving, and takes a few hours instead of a few days?


You could certainly do that in private but that doesn't mean it's not 'without legal concerns'. But, not shouting about it and not creating a repo called 'openviktor' would probably be a safer bet.

I certainly think the whole idea of IP ownership as related to software will become very interesting from a legal standpoint in the coming years. Personally I think that, over time, the legal challenges will become pretty overwhelming and a sort of legal bankruptcy will be declared at some point in one direction or another (as in, allowing this to happen or making it extremely easy to bring judgement and punishment, similar to spam laws). However, I would not want to be the first to find out, especially in Europe.


They have to let it happen. There's no stopping the tide here.


Just a minor thing - your readme claims “MIT licensed forever” but here you say there are “no plans to change that”. Those are different things!

Cool project.


Good point! There's an issue re: the license, so this will be addressed tomorrow.


Your username made me chuckle!


;) thanks.


very funny

