

gsuuon on May 13, 2024 | on: GPT-4o

Are these multimodal models able to discern the input voice tone? Really curious if they're able to detect sarcasm or emotional content (or even something like mispronunciation?)

bigyikes on May 13, 2024

Yes, they can, and they should get better at this over time.

There is a demo video where the presenter breathes heavily, and the AI is able to notice it as such when prompted.

It doesn't just detect tone; it also seems to be able to use tone itself.
