Hacker News

Interesting! I used Whisper last year to attempt to build an audio transcription tool but gave up due to the excessive amount of hallucinated output, no matter which model I used.

It would produce seemingly OK output until you started paying attention.

One example: it insisted that Biggie Smalls sings "Puttin five carrots in my baby girl ear". (It's "carats".)

It's apparently not useful in transcription as it don't reason [sic].



That example is not a hallucination; it's just a homophone with insufficiently clear context for the model to disambiguate it.


I'm well aware that mishearing "carats" as "carrots" is not a hallucination.

That's an example I gave after having used Whisper, the topic of discussion.


An example of what you claimed was a hallucination.




