Hacker News | theropost's comments

I think there is a real issue here, but I do not think it is as simple as calling it theft in the same way as copying books. The bigger problem is incentives. We built a system where writing docs, tutorials, and open technical content paid off indirectly through traffic, subscriptions, or services. LLMs get a lot of value from that work, but they also break the loop that used to send value back to the people and companies who created it.

The Tailwind CSS situation is a good example. They built something genuinely useful, adoption exploded, and in the past that would have meant more traffic, more visibility, and more revenue. Now the usage still explodes, but the traffic disappears because people get answers directly from LLMs. The value is clearly there, but the money never reaches the source. That is less a moral problem and more an economic one.

Ideas like GPL-style licensing point at the right tension, but they are hard to apply after the fact. These models were built during a massive spending phase, financed by huge amounts of capital and debt, and they are not even profitable yet. Figuring out royalties on top of that, while the infrastructure is already in place and rolling out at scale, is extremely hard.

That is why this feels like a much bigger governance problem. We have a system that clearly creates value, but no longer distributes it in a sustainable way. I am not sure our policies or institutions are ready to catch up to that reality yet.


> We have a system that clearly creates value, but no longer distributes it in a sustainable way

The same thing happened (and is still happening) with news media and aggregation/embedding like Google News or Facebook.

I don't know if anyone has found a working solution yet. There have been some laws passed and licensing deals [1]. But they don't really seem to be working out [2].

[1] https://www.cjr.org/the_media_today/canada_australia_platfor...

[2] https://www.abc.net.au/news/2025-04-02/media-bargaining-code...


I'm not sure I'd call [2] a case of it not working out, just as I wouldn't call the equivalent pressure from the USA to dismantle Medicare (our public health system) a case of it not working out.

The biggest issue with the scheme is the fact that it was structured to explicitly favour media incumbents, and is therefore politically unpopular.


> I do not think it is as simple as calling it theft in the same way as copying books

Aside from the incentive problem, there is a kind of theft, known as conversion: when you were granted a license under some conditions, and you went beyond them - you kept the car past your rental date, etc. In this case, the documentation is for people to read; AI using it to answer questions is a kind of conversion (no, not fair use). But these license limits are mostly implicit in the assumption that (only) people are reading, or buried in unenforceable site terms of use. So it's a squishy kind of stealing after breaching a squishy kind of contract - too fuzzy to stop incented parties.


The problem is there was a social contract. Someone spent their time and money to create a product that they shared for free, provided you visit their site and see their offerings. In this way they could afford to keep making this free product that everyone benefited from.

LLMs broke that social contract. Now that product will likely go away.

People can twist themselves into knots about how LLMs create “value” and that makes all of this ok, but the truth is they stole information to generate a new product that generates revenue for themselves at the cost of other people’s work. This is literally theft. This is what copyright law is meant to protect. If LLM manufacturers are making money off someone’s work, they need to compensate people for that work, same as any client or customer.

LLMs are not doing this for the good of society. They themselves are making money off this. And I’m sure if someone comes along with LLM 2.0 and rips them off, they’re going to be screaming to governments and attorneys for protection.

The ironic part of all of this is that LLMs are literally killing the businesses they need to survive. When people stop visiting (and paying) Tailwind, Wikipedia, news sites, weather, and so on, and only use LLMs, those sites and services will die. Heck, there’s even good reason to think LLMs will kill the Internet at large, at least as an information source. Why in the hell would I publish news or a book or events on the Internet if it’s just going to be stolen and illegally republished through an LLM without compensating me for my work? Once this information goes away or is locked behind nothing but paywalls, I hope everyone is ready for the end of the free ride.


There will be no royalties; simply require that all models trained on the public internet also be public.

This won't help tailwind in this case, but it'll change the answer to "Should I publish this thing free online?" from "No, because a few AI companies are going to exclusively benefit from it" to "Yes, I want to contribute to the corpus of human knowledge."


Contributing to human knowledge doesn’t pay the bills though

It can. The problem is the practice of using open source as a marketing funnel.

There are many projects that love to brag about being open source (it's "free"!), only to lock useful features behind a paywall, or do the inevitable license rug pull after other companies start profiting from the freedoms they've provided them. This is the same tactic used by drug dealers to get you hooked on the product.

Instead, the primary incentive to release a project as open source should be the desire to contribute to the corpus of human knowledge. That doesn't mean that you have to abandon any business model around the project, but that shouldn't be your main goal. There are many successful companies built around OSS that balance this correctly.

"AI" tools and services corrupt this intention. They leech off the public good will, and concentrate the data under the control of a single company. This forces well-intentioned actors to abandon open source, since instead of contributing to human knowledge, their work contributes to "AI" companies. I'm frankly not upset when this affects projects who were abusing open source to begin with.

So GP has a point. Forcing "AI" tools, and even more crucially, the data they collect and use, to be free/libre, would restore the incentive for people to want to provide a public good.

The narrative that "AI" will bring world prosperity is a fantasy promoted by the people who will profit the most. The opposite is true: it will concentrate wealth and power in the hands of a few even more than it is today. It will corrupt the last vestiges of digital freedoms we still enjoy today.

I hope we can pass regulation that prevents this from happening, but I'm not holding my breath. These people are already in power, and governments are increasingly in symbiotic relationships with them.


It's not as simple as calling it theft, but it is simply theft, plus the other good points you made.

Copying is theft, generating is theft, and it is not even taking anything they had. Future revenue can't be stolen.

I think once it becomes infrastructure and widely used knowledge the authors can't claim control anymore. Or shouldn't.


> Future revenue can't be stolen.

This is a big eye-roll but otherwise ya, this is one way to think of it. It's not all about money, though. The people running these companies are just taking, en masse, without credit. This is a basic human desire. Of course there is a discussion of whether or not we should evolve beyond that. It feels incredibly dystopian to me, though.


> We have a system that clearly creates value, but no longer distributes it in a sustainable way.

It does not "create value" it harvests value and redirects the proceeds it accrues towards its owners. The business model is a middleman that arbitrages the content by separating it from the delivery.

Software licensing has been broken for two decades. That's why free software isn't financially viable for anybody except a tiny minority. It should be. The entire industry has been operating on charity. The rich mega-corporations have decided they're no longer going to be charitable.


I think this kind of critique often leans too hard on “security through obscurity” as a cheap punchline, without acknowledging that real systems are layered, pragmatic, and operated by humans with varying skill levels. An open firmware repository, by itself, is not a failure. In many cases it is the opposite: transparency that allows scrutiny, reproducibility, and faster remediation. The real risk is not that attackers can see firmware, but that defenders assume secrecy is doing work that proper controls should be doing anyway.

What worries me more is security through herd mentality, where everyone copies the same patterns, tooling, and assumptions. When one breaks, they all break. Some obscurity, used deliberately, can raise the bar against casual incompetence and lazy attacks, which, frankly, account for far more incidents than sophisticated adversaries. We should absolutely design systems that are easy to operate safely, but there is a difference between “simple to use” and “safe to run critical infrastructure.” Not every button should be green, and not every role should be interchangeable. If an approach only works when no one understands it, that is bad security. But if it fails because operators cannot grasp basic layered defenses, that is a staffing and governance problem, not a philosophy one.


I’m beginning to think maybe I’m the only one that read this whole thing. The firmware storage isn’t the security through obscurity problem being talked about here. The hardcoded TLS private key definitely is though. And yes, it deserves shaming… terrible practice leads to terrible outcomes. Nobody is surprised that this is coming from tp-link at this point though.


> An open firmware repository, by itself, is not a failure

Isn’t the complaint that the location of the repo is not publicized?

Nobody would complain if it were linked directly from the company’s web page, I assume?


Just came back online here


Interesting point about touchscreens. I think it highlights a bigger issue with “safety” features sometimes backfiring. For example, that relentless beeping when the passenger seat detects weight but it’s just a backpack or groceries. I wonder how many drivers have been more distracted trying to silence the alarm than they would’ve been just ignoring the bag in the first place. Feels like we’ve traded one kind of risk for another. Do they really research this, or is it more of a gimmick?


But what if everything scales? What if, no matter how complicated, how obscure, how mundane, how niche... what if everything, I mean everything, scales?


Everything, I mean everything, doesn't scale.


Just tossing in my two cents - half the $25K cars people are asking for do exist, or did, but we’re basically banning them from the country with tariffs. It’s like we’re saying, “nah, we don’t really want cheap cars.”

Look at something like the Dolphin from China - it’s going for $8K–$9K USD over there. Ship a whole fleet of them and you’re still well under $25K. And we’re not talking junkers either - these are electric, decent build quality, ~300km range. Like... what exactly are we protecting here?

Feels like we’re pricing affordability out of the market on purpose.


Honestly, I’ve been thinking about this whole AGI timeline talk—like, people saying we’re going to hit some major point by 2027 where AI just changes everything. And to me, it feels less like a purely tech-driven prediction and more like something being pushed. Like there’s an agenda behind it, probably coming from certain elites or people in power, especially in the West, who see the current system and think it needs a serious reset.

What’s really happening, in my view, is a forced economic shift. We’re heading into a kind of engineered recession—huge layoffs, lots of instability—where millions of service and admin-type jobs are going to disappear. Not because the tech is ready in a full AGI sense, but because those roles are the easiest to replace with automation and AI agents. They’re not core to the economy, and a lot of them are wrapped in red tape anyway.

So in the next couple years, I think we’ll see AI being used to clear out that mental bureaucracy—forms, paperwork, pointless approvals, inefficient systems. AI isn’t replacing deep creativity or physical labor yet, but it is filling in the cracks and acting like a smart band-aid. It’ll seem useful and “intelligent,” but it’s really just a transition tool.

And once that’s done, the next step is workforce reallocation—pushing people into real-world industries where hands-on labor still matters. Building, manufacturing, infrastructure, things that can’t be automated yet. It’s like the short-term goal is to use AI to wipe out all the mindless middle-layers of the system, and the longer-term vision is full automation—including robotics and real-world systems—maybe 10 or 20 years out.

But right now? This all looks like a top-down move to shift the population out of the “mind” industries and into something else. It’s not just AI progressing—it’s a strategic reset, wrapped in the language of innovation.


My take is less tinfoil-hatty than this.

I simply think that the majority of people in AI today are sci-fi nerds who want to live out these fantasies and want to be part of something much larger than they are.

There's also the obvious incentive for AI companies that automating everything is extremely lucrative (i.e., they stand to gain lots of money/power from the hype and in the event that AGI is real).


> pushing people into real-world industries where hands-on labor still matters.

Your average worker in the Anglosphere is forty-plus and pre-diabetic, has a weak back and joints, and has low cardiovascular fitness, from decades of sitting down. It'll go swimmingly!

This reset just impoverishes everybody including those pushing it. Maybe they are Lovecraftian monsters that feed off mass pain.


I wish my AI would tell me when I'm going in the wrong direction, instead of just placating my stupid request over and over until I realize it myself, even though it probably could have suggested a smarter direction instead of just telling me "Great idea!"


I don't know if you have used 2.5, but it is the first model to disagree with directions I have provided...

"..the user suggests using XYZ to move forward, but that would be rather inefficient, perhaps the user is not totally aware of the characteristics of XYZ. We should suggest moving forward with ABC and explain why it is the better choice..."


Have you noticed the most recent one, gemini-2.5-pro-0506, suddenly being a lot more sycophantic than gemini-2.5-pro-0325? I was using it to beta-read and improve a story (https://news.ycombinator.com/item?id=43998269), and when Google flipped the switch, suddenly 2.5 was burbling to me about how wonderful and rich it was and a smashing literary success and I could've sworn I was suddenly reading 4o output. Disconcerting. And the AI Studio web interface doesn't seem to let you switch back to -0325, either... (Admittedly, it's free.)


It really gave me a lot of push back once when I wanted to use a js library over a python one for a particular project. Like I gave it my demo code in js and it basically said, "meh, cute but use this python one because ...reasons..."


Wow, you can now pay to have „engineers” overruled by artificial „intelligence”? People who have no idea are now going to be corrected by an LLM which has no idea by design. Look, even if it gets a lot of things right, it's still trickery.

I'll get popcorn and wait for more work coming my way five years down the road. Someone will have to tidy this mess up, and gen-covid will have lost all ability to think on their own by then.


You must be confusing „intelligence” with „statistically most probable next word”.


One trick I found is to tell the LLM that an LLM wrote the code, whether it did or not. The machine doesn't want to hurt your feelings, but loves to tear apart code it thinks it might've written.


I like just responding with "are you sure?" continuously. At some point you'll find it gets stuck in a local minimum/maximum and starts oscillating. Then I backtrack and look at where it wound up before that. Then I take that solution and go to a fresh session.


Isn’t this sort of what the reasoning models are doing?


Except they have no concept of what "right" is, whereas I do. Once it seems to have gotten itself stuck in left field, I go back a few iterations and see where it was.


150 lines? I find I can quickly scale to around 1,500 lines, and then I start being more precise about the classes and functions I am looking to modify.


It's completely broken for me over 400 lines (Claude 3.7, paid Cursor)

The worst is when I ask something complex, the model generates 300 lines of good code and then times out or crashes. If I ask it to continue, it will mess up the code for good, e.g. it starts generating duplicated code or functions which don't match the rest of the code.


It's a new skill that takes time to learn. When I started on gpt3.5 it took me easily 6 months of daily use before I was making real progress with it.

I regularly generate and run in the 600-1000LOC range.

Not sure you would call it "vibe coding" though, as the details and info you provide it, and how you provide them, are not simple.

I'd say realistically it speeds me up 10x on fresh greenfield projects and maybe 2x on mature systems.

You should be reading the code coming out. The real way to prevent errors is to read the reasoning and logic. The moment you see a misstep, go back and try the prompt again. If that fails, try a new session entirely.

Test time compute models like o1-pro or the older o1-preview are massively better at not putting errors in your code.

Not sure about the new Claude method, but true, slow test-time models are MASSIVELY better at coding.


The “go back and try the prompt again” is the workflow I’d like to see a UX improvement on. Outside of the vibe coding “accept all” path, reverse traversing is a fairly manual process.


I don't think you will realistically.

Having full control over inputs, and starting a new chat with either a narrower scope or clearer instructions when something goes wrong, is basically AGI-level work.

There is nobody but a human for now that can determine how bad an LLM actually screwed up its logic train.

But maybe you mean pure UI?

I could foresee something like a new-context-creation button that gives you a nice UI for choosing what to bring over and what to ditch, which would be pretty nice.

Maybe something like a git diff? Dropping this paragraph or bringing this function over with just simple clicks would be pretty slick!

I definitely see a future of better cross-chat context connections and information being powerful. Basically git, but for every conversation and every piece of code generated for a project.

Would be crazy hard but also crazy powerful.

If my startup blows up I might try something like that!


Cursor has checkpoints for this but I feel I’ve never used them properly; easier to reject all and reprint. I keep chats short.


Definitely a new skill to learn. Everyone I know that is having problems is just telling it what to do, not coaching it. It is not an automaton... instructions in, code out. Treat it like a team member that will do the work if you teach it right and you will have much more success.

But it is definitely a learning process for you.


Sounds like a Cursor issue


what language?


I need this. I just finished 300GB of CSV extracts, and the manipulation, data integrity checks, and so on take longer than they should.


Why wouldn't you use a data format meant to store floating point numbers?

HDF5 gives you a great way to store such data.
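
For illustration, a minimal sketch of what that looks like, assuming Python with numpy and h5py installed (the dataset names here are made-up placeholders, not anything from the original extracts):

    # Minimal sketch: store float columns in HDF5 instead of CSV.
    # Dataset names ("price", "volume") are hypothetical placeholders.
    import numpy as np
    import h5py

    prices = np.random.rand(1_000_000)    # stand-ins for real float data
    volumes = np.random.rand(1_000_000)

    with h5py.File("extract.h5", "w") as f:
        f.create_dataset("price", data=prices, compression="gzip")
        f.create_dataset("volume", data=volumes, compression="gzip")

    # Reading back: slicing avoids loading the whole file into memory.
    with h5py.File("extract.h5", "r") as f:
        first_chunk = f["price"][:10_000]

The numbers stay in binary form, so there is no parse/format round trip on every load, and you get compression and partial reads essentially for free.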


Sounds interesting, I'll give it a look. I'm unfortunately limited to CSV, XML, or XLS from the source system, then am transforming it and loading it into another DB.
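
If the source system only hands out CSV, a rough sketch of a workaround, assuming Python with pandas and SQLAlchemy (the file name, table name, key column, and connection string are all placeholders), is to stream the file in chunks, run integrity checks per chunk, and append into the target DB so memory use stays flat:

    # Rough sketch: chunked CSV -> DB load with a per-chunk integrity check.
    # "extract.csv", "target_table", "id", and the connection URL are placeholders.
    import pandas as pd
    from sqlalchemy import create_engine

    engine = create_engine("postgresql://user:password@host/dbname")

    for chunk in pd.read_csv("extract.csv", chunksize=500_000):
        chunk = chunk.dropna(subset=["id"])  # example integrity check
        chunk.to_sql("target_table", engine, if_exists="append", index=False)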

