
The absolute worst place to be right now is a B2B tech startup. Not only do you need to build some kind of app or product, you also need to bolt some kind of AI feature onto it. The users don't want it and never asked for it. It sucks all the resources out of the actual product you should be focusing on, doesn't actually work or works non-deterministically, yet you're held to the same standards as if it were any other kind of software. And the only lever you have to pull is a lengthy model re-training or fine-tuning/development cycle. The suits don't understand AI or what it takes to make it successful. They were sold on the hype that AI is going to save money, and forgot to budget for the team of AI engineers you'll need, the infrastructure for training, the extensive data annotation, and the reams of data that most startups don't have.

Tell me again how this isn't pure hell and the cuck chair?



> And the only lever you have to pull is a lengthy model re-training or fine tuning/development cycle.

Is this really how professionals work on such a problem today?

The times I've had to tune the responses, we'd gather bad/good examples, chuck them into a .csv/directory, then create an automated pipeline that gives us a success rate against what we expect, then start tuning the prompt, inference parameters, and other things in an automated manner. As we discover more bad cases, we add them to the testing pipeline.
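Roughly, the loop looks like this (a minimal sketch; the CSV columns, the model name, and the substring pass check are all placeholders for whatever your stack actually uses):

    import csv
    from openai import OpenAI

    client = OpenAI()

    def run_model(prompt: str, case_input: str) -> str:
        # one inference call per test case; swap in your own endpoint
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model
            temperature=0,
            messages=[
                {"role": "system", "content": prompt},
                {"role": "user", "content": case_input},
            ],
        )
        return resp.choices[0].message.content

    def success_rate(prompt: str, cases: list[dict]) -> float:
        # a case passes if the expected answer shows up in the output
        passed = sum(
            1 for c in cases
            if c["expected"].lower() in run_model(prompt, c["input"]).lower()
        )
        return passed / len(cases)

    with open("cases.csv", newline="") as f:
        cases = list(csv.DictReader(f))  # columns: input, expected

    for prompt in ["You are a terse assistant...", "You are a careful assistant..."]:
        print(f"{success_rate(prompt, cases):.0%}  {prompt[:50]}")

New bad cases just become new rows in cases.csv.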

Only if something was very badly wrong would you reach for model re-training or fine-tuning, or when you knew up front the model wouldn't be up to the exact task you have in mind.


Got it, professionals don't fine-tune their models, and you can do everything via prompt engineering and some script called optimize.py that fiddles with API parameters for your call to OpenAI. So simple!


It depends. Fine-tuning is a significant productivity drag over in-context learning, so you shouldn't attempt it lightly. If you are working on low-latency tasks or need lower marginal costs, then fine-tuning a small model might be the only way to achieve your goals.
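For a sense of the trade-off: in-context learning here just means shipping a handful of labeled examples in the prompt, which you can change with a deploy rather than a training run. A sketch, assuming the OpenAI chat API (the model name, categories, and examples are all made up):

    from openai import OpenAI

    client = OpenAI()

    # labeled examples live in the prompt, not in model weights,
    # so fixing a bad case is an edit, not a fine-tuning job
    FEW_SHOT = [
        {"role": "user", "content": "Ticket: 'Refund my order' -> category?"},
        {"role": "assistant", "content": "billing"},
        {"role": "user", "content": "Ticket: 'App crashes on login' -> category?"},
        {"role": "assistant", "content": "bug"},
    ]

    def classify(ticket: str) -> str:
        resp = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder
            temperature=0,
            messages=[{"role": "system", "content": "Answer with one category word."}]
            + FEW_SHOT
            + [{"role": "user", "content": f"Ticket: {ticket!r} -> category?"}],
        )
        return resp.choices[0].message.content.strip()

The fine-tuned small model wins once you've squeezed in-context learning dry and the remaining problem is latency or per-token cost, not quality.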


Agree for the most part, but at the SaaS company I'm at, we've built a feature that uses LLMs to extract structured data from large unstructured documents. It's not something that's been done well in this domain, and this solution works better than any other we've tried.

We've kept the LLM constrained to just extracting values with context, and we show the values to end users in a review UI that displays the source doc and lets them navigate to exactly the place in the doc where a given value was extracted. These are mostly numbers, but occasionally the LLM needs to do a bit of reasoning to determine a value (e.g., is this an X, Y, or Z type of transaction, where the exact words X, Y, or Z won't necessarily appear). Any calculations that can be performed deterministically are done in a later step using a very detailed, domain-specific financial model.
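The shape of the extraction step, as a hedged sketch (assuming the OpenAI client; the field schema and the verbatim-quote check are illustrative, not our exact implementation):

    import json
    from openai import OpenAI

    client = OpenAI()

    PROMPT = (
        "Extract each field from the document as JSON: "
        '{"fields": [{"name": "...", "value": "...", "source_quote": "..."}]}. '
        "source_quote must be copied verbatim from the document so a reviewer "
        "can jump to it. Extract only; do not compute derived numbers."
    )

    def extract(document_text: str) -> list[dict]:
        resp = client.chat.completions.create(
            model="gpt-4o",  # placeholder
            temperature=0,
            response_format={"type": "json_object"},
            messages=[
                {"role": "system", "content": PROMPT},
                {"role": "user", "content": document_text},
            ],
        )
        fields = json.loads(resp.choices[0].message.content)["fields"]
        # drop anything whose quote isn't actually in the doc; that check is
        # what makes the review UI's jump-to-source trustworthy
        return [f for f in fields if f["source_quote"] in document_text]

Everything numeric that can be derived is then computed deterministically downstream, outside the LLM.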

This is not a chatbot or other crap shoehorned into the app. Users are very excited about this - it automates painful data entry and allows them to check the source - which they actually do, because they understand the cost of getting the numbers wrong.



