Flat out impossible? If you mean “without clicking anything”, sure, but you could interrupt with your thumb, exit chat to send images and go back (maybe video too, I’ve never had any need), and honestly the 2-3 second response time never once bothered me.

I’m very excited about all these updates and it’s really cool tech, but all I’m seeing is quality of life improvements and some cool engineering.

That’s not necessarily a bad thing. Not everything has to be magic or revolutionary to be a cool update.



Did you even watch the video? It's just baffling that I have to spell this out.

Skip to 11:50 or watch the very first demo with the breathing. None of that is possible with TTS and STT. You can't ask old voice mode to slow down or modulate tone or anything like that because it's just working with text.


Yes, I watched the demo. True, those things were not possible, so if that’s what’s blowing you away then fair enough, I guess. For me, that doesn’t affect anything I have ever used voice for or probably ever will.

I’ve voice chatted with ChatGPT for hundreds of hours and never once thought “can you modulate your tone please?”, so those improvements are a far cry from magic or revolutionary imho. Again, that’s not to say they aren’t cool tech, forward advancements, or impressive, but magic or revolutionary are pretty high bars.

To each their own though.


Few people are going to say "modulate your tone" in a vacuum, sure, but that doesn't mean that ability, along with being able to manipulate all other aspects of speech, isn't an incredible advance that is going to be very useful.

Language learning, audiobook narration that is far more involved, you could probably generate an audio drama, actual voice acting, even just not needing to get all my words in before it prompts the model with the transcribed text, conversation that doesn't feel like someone is reading a script.

And that's just voice.

This is the kind of interaction that's possible now. https://www.youtube.com/watch?v=_nSmkyDNulk

And no, thumbing the pause button, sending an image and going back does not even begin to compare in usability.

Great leaps in usability are a revolution in themselves. GPT-3 existed for years, so why did ChatGPT explode when it did? You think it was intelligence? No. It was the usability of the chat interface.



