Thanks - it seems that Gemini Live is pretty far behind advanced voice mode at the moment. For example, I can't get it to speak slower when I want to understand what it is saying.
I'm still interested in what keyword I could use to search for the latest research in voice models.
If you want to try voice only, try unmute.sh by Kyutai, which will eventually be open-sourced.