
Absolutely. These are called "style tokens" and they're an active area of TTS research.

The problem is that currently your training data has to be annotated with these tokens, and that adds a lot to the difficulty of creating data sets.

I imagine that over time this will get much easier to do.
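To make "annotated with these tokens" a bit more concrete, here's a rough sketch of what reading style-annotated transcripts could look like. Everything in it is an assumption for illustration: the `<happy>`-style prefix format, the `Utterance` type, and the `parse_annotated_line` helper are hypothetical and not taken from any particular TTS system or data set.

```python
import re
from typing import NamedTuple


class Utterance(NamedTuple):
    """One transcript line split into a style label and the spoken text."""
    style: str
    text: str


# Hypothetical annotation format: an optional "<style>" tag at the start of the line.
_TAG = re.compile(r"^<(\w+)>\s*(.*)$")


def parse_annotated_line(line: str, default_style: str = "neutral") -> Utterance:
    """Split a transcript line into (style, text); untagged lines get default_style."""
    stripped = line.strip()
    match = _TAG.match(stripped)
    if match is None:
        return Utterance(default_style, stripped)
    return Utterance(match.group(1), match.group(2))


if __name__ == "__main__":
    for raw in ["<happy> Great to see you!", "The meeting is at noon."]:
        print(parse_annotated_line(raw))
```

The parsing is the easy part. The cost is in someone (or something) having to decide on the right tag for every line of the data set.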



Are there good emotion detectors for speech-to-text, much like the ones that exist for facial recognition?


I'm not aware of any, and I haven't had much time to look, since I'm not at the point of working with style tokens yet. I'm certain this would be useful for annotating data and for all sorts of other applications, such as sentiment analysis.



