
This does not address the issue raised in iLoveOncall's third paragraph: "the same comment can be a nitpick on one CR but crucial on another..." In "attempt 2", you say that "the LLM's judgment of its own output was nearly random", which raises questions that go well beyond nitpicking: whether the current state of the art in LLM code review is fit for much more than ticking the box that says "yes, we are doing code review."
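For anyone wondering what "nearly random" means concretely, here is a minimal sketch of one way to test it: ask the model to label the same review comment several times and measure how often its verdict holds. The complete() helper and the prompt wording are hypothetical stand-ins for whatever LLM client you use, not the article's actual setup.

    import collections

    def judge_severity(comment: str, diff: str, complete) -> str:
        # Ask the model to label its own review comment. `complete` is a
        # hypothetical callable wrapping your LLM API; returns raw text.
        prompt = (
            "Given this diff:\n" + diff +
            "\nIs the following review comment a NITPICK or CRUCIAL?\n" +
            comment + "\nAnswer with one word."
        )
        return complete(prompt).strip().upper()

    def self_consistency(comment: str, diff: str, complete, runs: int = 10) -> float:
        # Fraction of runs agreeing with the majority label. A stable judge
        # scores near 1.0; coin-flip behavior hovers around 0.5-0.6.
        labels = [judge_severity(comment, diff, complete) for _ in range(runs)]
        majority_count = collections.Counter(labels).most_common(1)[0][1]
        return majority_count / runs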


If you are using an LLM for judgment, you are using it wrong. An LLM is good for generating suggestions and brainstorming, not for making judgments.

That's why it is called Generative AI.
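A minimal sketch of that division of labor, assuming a generic complete() LLM wrapper (a hypothetical callable, not any particular vendor's API): the model drafts candidate comments, and a human decides which ones are worth posting.

    def draft_review_comments(diff: str, complete) -> list[str]:
        # Generation: let the model brainstorm candidate comments, one per line.
        prompt = ("Suggest possible code review comments for this diff, "
                  "one per line:\n" + diff)
        return [line.strip() for line in complete(prompt).splitlines()
                if line.strip()]

    def triage(candidates: list[str]) -> list[str]:
        # Judgment: a human picks what actually matters on this CR.
        kept = []
        for c in candidates:
            if input(f"Post this comment? [y/N] {c}\n> ").strip().lower() == "y":
                kept.append(c)
        return kept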


Indeed - and if you are doing code review without judgment, you are doing it wrong.



