
I had a similar take until about a week ago. A friend showed me his workflow with Copilot and whatever the JetBrains AI assistant is called.

Use it as a tool: instead of opening up a new tab, searching for the API docs of the library you're trying to find a function in, finding the function, and re-reading the parameter arguments for the 400th time, what if you could just highlight a snippet, say "Paginate the results from S3 using boto3", and have the code populate?
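That prompt is a real boto3 use case. A sketch of what a good completion looks like, with the page-flattening pulled into a plain function so it can be checked without AWS credentials (bucket name and prefix are placeholders):

```python
def collect_keys(pages):
    """Flatten the 'Contents' entries of list_objects_v2 pages into object keys."""
    keys = []
    for page in pages:
        for obj in page.get("Contents", []):  # empty pages have no 'Contents' key
            keys.append(obj["Key"])
    return keys

# With boto3 installed and credentials configured:
# import boto3
# s3 = boto3.client("s3")
# paginator = s3.get_paginator("list_objects_v2")
# keys = collect_keys(paginator.paginate(Bucket="my-bucket", Prefix="logs/"))
```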

You still have to have the clarity of thought to know what you're doing, but the time it takes to write every line of basic stuff you've done 1000x before can be greatly compressed when it's inline in your IDE.

I think this is the move for most LLM tools: integrate it with existing tooling. An LLM for Excel for corporate bookkeepers, CPAs, etc will be great. A Word/PDF summarizer that's tuned for attorneys will also be fantastic. Highlight a paragraph, ask for relevant case law, etc.

I thought ~2 years ago the results were... not great. Now I'm pretty happy with it.

SecureFrame (which helps with compliance regimes like SOC 2) recently added the ability to generate Terraform templates that provision infrastructure to fix specific platform risks on AWS, Azure, GCP, etc.

It definitely needs someone at the helm since it does hallucinate, but I have found it to cut down my time on mundane tasks or otherwise niche/annoying problems. When was the last time you visited 4+ StackOverflow posts to find your answer? Copilot, so far, has always hit a pretty close answer very quickly.



I also had to build intuition for when it will be appropriate versus not. It's hard to describe, but one very positive signal is: "will any hallucination be caught in under 30 seconds?" Even in ChatGPT Plus you can have it write its own unit tests and run them in the original prompt (you can even put that in your profile's Custom Instructions so you don't have to type it every time).
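The pattern I ask for is roughly this: the function plus assertions it runs against itself in the same reply (a made-up illustration; `slugify` is just an example task):

```python
import re

def slugify(text):
    """Lowercase and replace runs of non-alphanumeric characters with single hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

# The self-checks the model writes and runs alongside the code,
# so a hallucinated behavior surfaces immediately:
assert slugify("Hello, World!") == "hello-world"
assert slugify("  extra   spaces  ") == "extra-spaces"
assert slugify("already-clean") == "already-clean"
```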

So one mistake was using it for something where runtime performance on dozens of quirky data files was critical; that nearly set my CPU on fire. But str -> str data cleanup, a chain of simple API calls, or a one-off data visualization? Chef's kiss.
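By str -> str cleanup I mean throwaway normalizers like this (a made-up example, but representative of the scale of task that works well):

```python
def clean_name(raw):
    """Trim, collapse internal whitespace, and title-case a messy name field."""
    return " ".join(raw.split()).title()

assert clean_name("  ACME   corp ") == "Acme Corp"
assert clean_name("acme\tcorp") == "Acme Corp"
```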


> to write every line for basic stuff you've done 1000x before

There are ways to avoid writing basic stuff you've done 1000x before that are better than LLMs though...

Put it in a well-thought-out function or package or other form of shared/reusable code. You can validate it, spend the time to make sure it covers your edge cases, optimize it, test it, etc. so that when you go to reuse it you can have confidence it will reliably do what you need it to do. LLM-generated code doesn't have that.
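For example, a small helper whose edge cases you've validated once, up front, and can then reuse with confidence (a sketch):

```python
def chunked(seq, size):
    """Split a sequence into lists of length `size`; the last chunk may be shorter."""
    if size < 1:
        raise ValueError("size must be >= 1")
    return [list(seq[i:i + size]) for i in range(0, len(seq), size)]

# Edge cases covered once, at write time, not re-derived on every reuse:
assert chunked([1, 2, 3, 4, 5], 2) == [[1, 2], [3, 4], [5]]
assert chunked([], 3) == []
```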

(When you think about how LLMs are trained and work, you realize they are actually just another form of code reuse, but one where there are various transformations to the original code that may or may not be correct.)

Where LLMs shine for coding is code completion. You get the LLM output in little chunks that you can review correctly and completely, in the moment: "yeah, that's what I want", "no, that's no good", or "OK, I can work with that". Not surprising, since predicting completions is what LLMs actually do.


I don't know exactly how you use it, but this isn't my experience at all. If you ask an LLM anything too specific, anything that isn't an obvious, commonly discussed issue (and the obvious ones are something I almost never need to ask about), it just makes up nonsense to fill the space.

Equally, if you ask it general questions, it misses information and is almost always incomplete, leaving out the slightly more obscure elements. Again, I need comprehensive answers; I can come up with incomplete ones myself.

What's really obvious to me when I use it is that it's an LLM trained on pre-existing text; that really comes through in the character of its answers and its errors.

I'm very glad others find them useful and productive, but for me they're disappointing given how I want to use them.


That's fair, it might not be for you. In 'old school' ML, for a binary classifier, there's the concept of Precision (the % of Predicted Positives that are ACTUALLY Positive) and Recall (the % of ACTUAL Positives that are Predicted Positive).
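For concreteness, here's what those two numbers compute to on a toy label set (made-up data):

```python
def precision_recall(y_true, y_pred):
    """Precision and recall for binary labels (1 = positive, 0 = negative)."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# 2 of 3 positive predictions are correct; 2 of 4 actual positives are found.
p, r = precision_recall([1, 1, 1, 1, 0, 0], [1, 1, 0, 0, 1, 0])
```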

It sounds like you want perfect Precision (no errors on specific Qs) and perfect Recall (comprehensive on general Qs). You're right that no model of any type has ever achieved that on any large real-world data, so if that's truly the threshold for useful in your use cases, they won't make sense.


I just want something useful. I'm not talking about perfection; I'm talking about answers that are not fit for purpose. 80% of the time the answers are just not useful.

How are you supposed to use LLMs if the answers they give are not salvageable with less work than answering the question yourself using search?

Again, for some people it might be fine, but for technical work, LLMs don't seem to cut it.


Sorry if this is sophomoric, but when you said "you have to have clarity of thought", what jumped to mind was the phrase "you have to speak to the code"... I thought it encapsulated your point about clarity of thought quite saliently for me.


You must be one with the code. You must be the code.



