I cannot speak for your work, but for classification steps and data structuring it’s quite accurate, and this is with regular testing. For my work and for folks in adjacent industries, LLMs are fantastic and add a lot of value to our workflows. You’re honestly holding on to this dead idea that LLM outputs are full of hallucinations. Throwaway account with throwaway comment.
"Quite accurate" is not acceptable for almost all relevant tasks in a business context. Somebody needs to manually check everything. Almost no time is saved.
Speaking as someone who has built many small OpenAI integrations, aka wrappers, in business apps.
So I know you are trolling, but let’s be real. Why would I share my business measurements with you on here? Of course my experience is an anecdote, and of course the measurements I use to determine accuracy are more in depth than “quite accurate”. What I can say is that for data structuring we beat humans on accuracy and do it extremely quickly. That LLM system runs at a lower cost than some of our more traditional systems.
If all you have done is build wrappers, I see you have barely scratched the engineering surface, so that explains it.
It does save time for tasks where the output is easily checked, such as image generation and translations. But the quality is often mediocre.