Hacker News | Madmallard's comments

Idk, I was using ChatGPT 3.5 to do stuff and it was pretty helpful back then.

Doubt it.

People want to interact with other humans.

Hotel doorman problem etc.


Devs really don't want to work with tech writers to document their code though

I don't know if it's as nuanced as this.

Just seems like it's dependent on what you're working on and what training data is available for it.

AI definitely just spews out Python and JavaScript for me to do all sorts of things quickly.

But it can't translate my XNA game to JavaScript worth a damn. It's terrible with visual work as well.


"The LLM should be able to determine if a task was completed successfully or not."

Writing logic that verifies something complex requires basically solving the problem entirely already.


Situation A) Model writes a new endpoint and that's it

Situation B) Model writes a new endpoint, runs lint and build, adds e2e tests with sample data and runs them.

Did situation B mathematically prove the code is correct? No. But the odds that the code is correct increase enormously. You see all the time how the agent finds errors at any of those steps and fixes them, errors that otherwise would have slipped by.
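
A rough sketch of what Situation B boils down to, assuming hypothetical check steps (the names `lint`, `build`, `e2e` and the stand-in lambdas are made up for illustration, not any particular agent's API): run each step in order and report the first one that fails, so it can be fixed before moving on.

```python
from typing import Callable, List, Optional, Tuple

# Each step is a (name, check) pair; check returns True on success.
# In a real setup these would shell out to a linter, a build, and an
# e2e test runner. Here they are hypothetical stand-ins.
Step = Tuple[str, Callable[[], bool]]


def first_failing_step(steps: List[Step]) -> Optional[str]:
    """Run steps in order; return the name of the first failure, or None."""
    for name, check in steps:
        if not check():
            return name
    return None


if __name__ == "__main__":
    steps = [
        ("lint", lambda: True),
        ("build", lambda: True),
        ("e2e", lambda: False),  # pretend the e2e tests caught a bug
    ]
    print(first_failing_step(steps))
```

Getting `None` back proves nothing formally; it only rules out the kinds of errors those particular steps are able to catch.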


LLM-generated tests, in my experience, are really poor.

Doesn't change the fact that what I mentioned greatly improves agent accuracy.

AI-generated implementation with AI-generated tests left me with some of the worst code I've witnessed in my life. Many of the passing tests it generated were tautologies (i.e. they would never fail even if behavior was incorrect).

When the tests failed, the agent tended to change the (previously correct) test so that it passed but was functionally incorrect, or it "wisely" concluded that both the implementation and the test were correct but that external factors were making the test fail (there weren't any).

It behaved much like a really naive junior.
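
To illustrate what a tautological test looks like, here is a minimal sketch. The function `apply_discount` and its bug are hypothetical, invented for this example and not taken from the code described above.

```python
def apply_discount(price: float, percent: float) -> float:
    """Hypothetical function under test, with a deliberate bug."""
    return price + price * percent / 100  # bug: adds instead of subtracting


def tautological_test() -> bool:
    # Compares the function's output with itself, so it passes
    # regardless of what apply_discount actually returns.
    result = apply_discount(100.0, 10.0)
    return result == apply_discount(100.0, 10.0)


def meaningful_test() -> bool:
    # Compares against an independently known expected value (90.0),
    # so it fails while the bug above is present.
    return apply_discount(100.0, 10.0) == 90.0
```

The first test passes despite the bug; the second fails until the implementation is fixed, which is the whole point of having a test.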


Which coding agent and which model?

Those just don't appear at all on Hacker News

Gee I wonder why


Because most people don't work on public projects and can't share the code publicly?

What's more interesting is the lack of examples of non-trivial projects that are provably vibe-coded and that claim to be of high-quality.

I think many of us are looking for: "I vibe-coded [this] with minimal corrections/manual coding on a livestream [here] and I believe it to be high-quality code"

If the code is in fact good quality then the livestream would serve as educational material for using LLMs/agents productively and I guarantee that it would change many minds. Stop telling people how great it all is, show them. I don't want to be a naysayer, I want to be impressed.


I'm considering vibe-coding a translation of one of my XNA games to JavaScript and recording the process, using all of the latest tools and strategies: agents, .md files, multiple LLMs, etc.

"sufficient detail, and if done correctly" -> the machine will once again be able to interpret your intention ...

This does not actually follow from the way LLMs work.


Yeah grinding the domain expertise is definitely the play if you have the resources to do so.

They are stealing trillions in assets lol

And destroying the gaming industry and altering the energy grid and pooping on the environment


It has already been shown repeatedly in GitHub repositories over the last year that authors are really unhappy with AI-generated pull requests and test cases.

I am not invested in anything, I am merely sharing my personal experience.

Lol there's definitely a war on Hacker News

There are vested interests posting 20 replies in a single thread that benefits them and flagging replies that don't

Literally 20-25% of dissenters' comments in each of these posts are being repeatedly flagged.


You're witch hunting.

I haven't flagged or downvoted anybody and I have no vested interest in anything. Not sure what my cause should be and what would be my benefit.

My profile contains my full name, so you can look me up. I'm a random freelancer, not somebody with any stake in pushing AI.

