FanaHOVA's comments | Hacker News

Everything in this comment is wrong lol

You know it's an EU study because they bring up "AI patents" in the first 2 minutes of it, as if they mean anything



oh i didn't know that claude code has a desktop app already


And it uses worktrees.


It isn’t its own app, but it’s built in to their desktop, mobile and web apps.


People can write horrible PRs manually just as well as they do with AI (see Hacktoberfest drama, etc).

"LLM Code Contributions to Official Projects" would read exactly the same if it just said "Code Contributions to Official Projects": Write concise PRs, test your code, explain your changes and handle review feedback. None of this is different whether the code is written manually or with an LLM. Just looks like a long virtue signaling post.


Virtue signaling? That seems like an uncharitable reading.

The point, and the problem, is volume. Doing it manually has always imposed a de facto volume limit which LLMs have effectively removed. Which I understand to be the problem these types of posts and policies are designed to address.


A large enough difference in degree becomes a difference in kind. Chat bots have vastly inflated the amount of shitty PRs, to the degree that it needs different solutions to manage.


Exactly. We never had a problem with spammy PRs before. Even at the height of Hacktoberfest, the vast majority were painfully obvious and confined to documentation. It was easy and obvious to reject those. But LLMs have really changed the game, and this policy was explicitly prompted by a number of big PRs that were obviously purely vibe-coded and we felt we really needed to get a defined policy out that we could point to and say "no, this is why we're rejecting this".


The effort to write shitty code is way less when you are using AI; you can create a 1k-line PR with a single prompt. This policy is important because no one is saying "we hate AI"; instead it advises developers to use it responsibly. It comes at the right time, since many people are using AI without understanding the problems and without being accountable for their contributions.


Are you saying that every piece of code you have ever written contains a full source list of every piece of code you previously read to learn specific languages, patterns, etc?

Or are you saying that every piece of code you ever wrote was 100% original and not adapted from any previous codebase you ever worked in or any book / reference you ever read?


While I generally agree with you, these "LLM is a human" comparisons really are tiresome, I feel. The analogy hasn't been proven, and I don't know how many other legal issues could have been solved if adding "like a human" made it okay. Google v. Oracle? "Oh, you've never learned an API?!" Or take the original Google Books controversy: "it's reading books and memorizing them, like humans can." I do agree it's different, but I don't like this line of argument at all.


I agree, that's why I was trying to point out that saying "if a person did that we'd have a word for them" is useless. They are not people, and people don't behave like that anyway. It adds nothing to the discussion.


What's with the bad takes in this thread. That's two strawmen in one comment, it's getting a bit crowded.


Or the original point doesn't actually hold up to basic scrutiny and is indistinguishable from straw itself.


The original point, that LLMs are plagiarising inputs, is a very common and common-sense opinion.

There are court cases where this is being addressed currently, and if you think about how LLMs operate, a reasonable person typically sees that it looks an awful lot like plagiarism.

If you want to claim it is not plagiarism, that requires a good argument, because it is unclear that LLMs can produce novelty, since they're literally trying to recreate the input data as faithfully as possible.


I need you to prove to me that it's not plagiarism when you write code that uses a library after reading documentation, I guess.

> since they're literally trying to recreate the input data as faithfully as possible.

Is that how they are able to produce unique code based on libraries that didn't exist in their training set? Or that they themselves wrote? Is that how you can give them the documentation for an API and it writes code that uses it? Your desire to make LLMs "not special" has made you completely blind to reality. Come back to us.


What?

The LLM is trained on a corpus of text, and when it is given a sequence of tokens, it finds the set of tokens that, when one of them is appended, makes the resulting sequence most like the text in that corpus.

If it is given a sequence of tokens that is unlike anything in its corpus, all bets are off and it produces garbage, just like machine learning models in general: if the input is outside the learned distribution, quality goes downhill fast.

The fact that they've added a Monte Carlo feature to the sequence generation, which makes it sometimes select a token that is slightly less like the most exact match in the corpus does not change this.

LLMs are fuzzy lookup tables for existing text, that hallucinate text for out-of-distribution queries.

This is LLM 101.

If the LLM were trained only on documentation, then there would be no problem: it would generate a design, look at the documentation, understand the semantics of both, and translate the design to code by using the documentation as a guide.

But that's not how it works. It has open source repositories in its corpus that it then recreates by chaining together examples in the stochastic-parrot method I described above.
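The next-token step being described, a greedy pick versus the occasionally-off-argmax sampled pick, can be sketched in a few lines. This is a toy illustration only; the vocabulary, scores, and function names are all invented, not any real model's internals:

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw next-token scores into a probability distribution.
    Lower temperature sharpens it toward the single best match."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical scores for a tiny illustrative vocabulary.
vocab = ["the", "cat", "sat", "banana"]
logits = [2.0, 1.0, 0.5, -3.0]

probs = softmax(logits)

# Greedy decoding: always append the most corpus-like token.
greedy = vocab[probs.index(max(probs))]

# Sampled decoding: sometimes append a slightly less likely token.
sampled = random.choices(vocab, weights=probs, k=1)[0]
```

At temperature approaching zero this collapses to pure argmax, i.e. the most faithful reproduction of the learned distribution; raising it is the randomness knob referred to above.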


K


No, you need to prove that it is not plagiarism when you use an LLM to produce a piece of code that you then claim as yours.

You have the whole burden of proof thing backwards.


Oh wild, I was operating under the assumption that the law requires you to prove that a law was broken, but it turns out you need to prove it wasn't. Thanks!


HN has guidelines for a reason.


You're adhering to an excess of rules, methinks!


One company trying: https://www.system.com


Because they make $60B/yr on advertising and car sales is a very valuable ad market.


> not truly groundbreaking foundation models.

Where is any proof that Yann LeCun is able to deliver that? He's had way more resources than any other lab during his tenure, and yet has nothing substantial to show for it.


The structure of each section gives away that it's mostly AI even without having to read the actual words. I'm sure it was AI + writer, but there's something about ending each section with 3-4 short, question-like sentences that is strongly AI. It's the same format as the successful LinkedIn slop, so maybe it's not AI, just algo-induced writing.


Yup. It's the colons after every paragraph's first sentence:

> It worked because it solved a real problem: Kenyans were already sending money through informal networks. M-PESA just made it cheaper and safer.

> Here’s why this matters: M-PESA created a payment rail with near-zero transaction costs.

> The magic is this: You’re not buying a $1,200 solar system.

> It gets even better: there are people who will pay for credits beforehand.

It's just again and again and again. It sounds 100% ChatGPT.

Maybe this is 100% written by hand by someone who reads too many ChatGPT-generated articles. Possibly the author just spends a ton of time chatting with ChatGPT and has picked up its style. Or it's just more AI-written than OP wants to admit.


We are so cooked. We spend more time trying to suss out whether something was written by AI than actually reading the article. So many legitimate ways of writing are now "ai" style. I used to use the emdash a lot, but now I deliberately avoid it because it's an AI smell, using the less "correct" version instead.


The equivalent of "If you have to ask, you can't afford it" here is "If you have to ask, you shouldn't do it".


Overall for the common person I'd agree, but I assume we're all more or less hackers here and for us, I'd say "If you have to ask, ask and learn, then do it".

If everyone followed your advice no one would ever do anything, as we all begin somewhere, something that should be OK.

Of course, don't do million-dollar trades when you begin, but we shouldn't push back on people wanting to learn; that feels very backwards compared to the hacker ethos.


We shouldn't push back on people wanting to learn, but we should really point out, very loudly, that not fully understanding something like shorting can turn a small investment someone was fully OK with losing into a life-altering bankruptcy due to a margin call.

Leverage can be a fearful thing.


Yup, I agree: be clear about what the consequences are if you fuck up, and allow people to fuck up if they wish.


> "If you have to ask, ask and learn..."

Totally! But also keep this in mind :)

https://www.explainxkcd.com/wiki/index.php/1570:_Engineer_Sy...


How about, "If you think an explanation from HN will explain it all to you, you're being naive about the complexities and risks"?


thank you for being nice


To expand on the original reply to you - shorting companies, or engaging in almost any stock-based activity beyond “buy and hold,” typically entails much, much higher risk than just buying and selling stock. The most you can lose when buying a share is the purchase price, and even that is fairly unlikely. But when you get into even options etc., you’re magnifying your risk - small swings in the market can lead to large and disproportionate losses - and when you get into shorting in particular, you can lose far more than your initial investment. This is why you’re getting the reaction you’re getting: the thing you’re asking about is sufficiently risky that if you're asking on Hacker News (and not, say, asking a professional), you don’t understand the risk profile well enough to do it “safely.”

That, and because snarky answers get more imaginary internet points than helpful ones.
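The asymmetry described above, bounded loss on a long versus unbounded loss on a short, works out with simple arithmetic. A toy sketch; the prices, share counts, and function names are made up for illustration:

```python
def long_pnl(entry, exit_price, shares):
    # A long position's worst case is losing the full purchase price
    # (the stock goes to zero).
    return (exit_price - entry) * shares

def short_pnl(entry, exit_price, shares):
    # A short profits when the price falls, but losses grow without
    # bound as the price rises.
    return (entry - exit_price) * shares

# Buy 100 shares at $50: the most you can ever lose is $5,000.
long_loss = long_pnl(50, 0, 100)      # -5000

# Short 100 shares at $50 and the stock triples to $150: you are
# down $10,000, twice what you put at risk, and a margin call can
# force you to realize it at the worst moment.
short_loss = short_pnl(50, 150, 100)  # -10000
```

(This ignores borrow fees and margin interest, which make a losing short even worse.)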


> you don’t understand the risk profile well enough to do it “safely.”

Since when is this a problem? For god's sake, let people fuck up and harm themselves if they're stupid enough to take the risks, or not.

I think it's fine to say "Remember, this is risky because of A, B and C, but here's how to do it anyways..." but straight up "If you have to ask, you shouldn't" seems so backwards and almost mean, especially when we talk about money which is mostly "easy come, easy go". Let the fool be parted with their money if that's what they want :)


I mean, there’s risk and there’s risk. If someone comes in asking “how do I mod my phone/ebike/toaster”, sure, caveat commentor and all that. If someone comes in asking “how do I make dioxygen difluoride,” that’s a different category of risk. OP can do whatever they want, but I’m not in the habit of giving guns to people who don’t know what they are without making sure they know which risk category they’re in.

