To anyone trying this: does it unlock anything you tried and failed to do with past LLMs, so that you can now try again? Do you see this as an incremental improvement or as something that opens up new opportunities?
I haven't quite figured out whether the open weights they released on Hugging Face are enough to run the (realtime) model locally. I hope so, though! For the larger model with diarization, I don't think they open-sourced anything.
> We've worked hand-in-hand with the vLLM team to have production-grade support for Voxtral Mini 4B Realtime 2602 with vLLM. Special thanks goes out to Joshua Deng, Yu Luo, Chen Zhang, Nick Hill, Nicolò Lucchesi, Roger Wang, and Cyrus Leung for the amazing work and help on building a production-ready audio streaming and realtime system in vLLM.
In my opinion, getting full use out of these LLMs will take a significant amount of validation and human oversight. As you mentioned, I'd recommend giving the AI room for error and improving the experience of manually checking everything. Maybe create a tool to make manually checking the output easier?
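For what it's worth, here is a rough sketch of what such a checking tool could start from. It assumes each transcript segment comes with some per-segment confidence score, which I have not verified this model actually exposes; the `Segment` type, the `review_queue` helper, and the 0.85 threshold are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float       # segment start time in seconds
    end: float         # segment end time in seconds
    text: str          # transcribed text
    confidence: float  # hypothetical score in [0, 1]; an assumption, not a known model output


def review_queue(segments, threshold=0.85):
    """Return the segments scoring below `threshold`, least confident first."""
    flagged = [s for s in segments if s.confidence < threshold]
    return sorted(flagged, key=lambda s: s.confidence)


# Made-up example data, purely for illustration.
example = [
    Segment(0.0, 2.1, "hello everyone", 0.97),
    Segment(2.1, 4.0, "their going to ship", 0.62),
    Segment(4.0, 6.5, "in queue three", 0.80),
]

for seg in review_queue(example):
    print(f"[{seg.start:.1f}-{seg.end:.1f}] {seg.confidence:.2f} {seg.text}")
```

The idea is just that a human checks the shakiest segments first instead of rereading the whole transcript. If no confidence score is available, the same queue could be ordered by something else, such as disagreement between two transcription passes.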