Just think about it: a 1% error rate means roughly 1 in 100 customers gets some wrong information. And they go to the place trusting it, just to hear that the AI lied to them. Say you have 1,000 or 10,000 customers using the system; now you potentially have 10 or 100 one-star negative reviews... And this might be just answering simple queries like a restaurant menu or opening times.
No decently coded chatbot is going to respond with an incorrect restaurant menu or opening time. You'd call a function to return the menu from a database or the opening time from a database. At worst, the function fails, but it's not going to hallucinate dishes.
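To make that concrete, here's a rough sketch of what that lookup layer could look like (the table names, the get_menu/get_opening_hours helpers, and the sqlite setup are all made up for illustration). The bot only decides which function to call; the menu and hours come straight out of the database, so the worst case is an error message, not invented dishes:

```python
import sqlite3

# Hypothetical schema/table names (menu_items, opening_hours), just for illustration.
db = sqlite3.connect("restaurant.db")

def get_menu() -> list[str]:
    # Dish names come straight from the database; nothing is generated.
    return [row[0] for row in db.execute("SELECT name FROM menu_items")]

def get_opening_hours(day: str) -> str:
    row = db.execute(
        "SELECT opens, closes FROM opening_hours WHERE day = ?", (day,)
    ).fetchone()
    if row is None:
        raise LookupError(f"no hours stored for {day}")  # fail loudly rather than guess
    return f"{row[0]}-{row[1]}"

TOOLS = {"get_menu": get_menu, "get_opening_hours": get_opening_hours}

def run_tool(name: str, **kwargs) -> str:
    # The model only picks the tool name and arguments; the reply text is verbatim DB data.
    try:
        return str(TOOLS[name](**kwargs))
    except Exception as exc:
        return f"Sorry, I couldn't look that up ({exc}). Please contact the restaurant directly."
```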
Exactly this. Even for internal use. Our corp approved a small project where an NN does the analysis of our nightly test runs (our test suite takes a very long time to run). For now it classifies results into several existing broad categories. Product-type failures are usually the most important, and this should let us focus our efforts on them. But even a 1% false rate (it's actually in the double digits in real life) means that we, the QAs, need to verify all the results anyway (see the rough numbers below). So no time is saved, and this NN software is eh... useless.
There are other ideas for how to make it more useful, but my point is that a non-zero failure rate with unpredictable answers just doesn't work in many domains.
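For what it's worth, here's the back-of-the-envelope reasoning behind the "verify everything anyway" conclusion (the numbers in the snippet are made up; only the logic matters):

```python
# Made-up numbers, just to show why a "mostly right" classifier saves no review time.
nightly_results = 5000    # test results per nightly run (hypothetical)
product_failures = 50     # of those, real product-type failures (hypothetical)
false_rate = 0.01         # the claimed 1% misclassification rate

# Expected mistakes per night, assuming errors are spread uniformly:
missed = product_failures * false_rate                     # real failures filed under the wrong category
noise = (nightly_results - product_failures) * false_rate  # other results wrongly labeled as product failures

print(f"~{missed:.1f} real failures mislabeled and ~{noise:.0f} false alarms per night")
# You don't know *which* labels are the wrong ones, so QA still has to check everything.
```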
Yeah, a 1% error rate (IME in practice the error rate is _much_ higher than this if you care about detail, but whatever) just won't fly in most use cases. You're really talking about stuff which doesn't matter at all (people are rarely willing to pay very much for this) or where its output is guaranteed to be reviewed by a human expert (at which point, in many cases, well, why bother).
As programmers we complain about the ~1% of cases where copilot-type models produce terrible code. It's annoying, but you can live with it.
But for many other things, a 1% error rate with no bound on how terrible the hallucinated error can be is perhaps unworkable.
For example, Air Canada ended up liable for a discount which its AI chatbot made up.