Hacker News | zkstefan's comments

> There aren't any open source audio-to-audio models yet

I don't think that's true. See this, for example: https://huggingface.co/facebook/seamless-m4t-v2-large. It's not general-purpose like GPT-4o, but translation still seems pretty useful.


I don't think SeamlessM4T qualifies as an end-to-end audio-to-audio model. The paper states "the task of speech-to-speech translation in SeamlessM4T v2 is broken down into speech-to-text translation (S2TT) and then text-to-unit conversion (T2U)". And while language translation is an important application, as you mention, the model is strictly limited to that. It wouldn't understand or produce non-speech audio (e.g. singing, music, or environmental sounds), and you can't have a conversation with it.


I don't disagree that variable fonts require a lot of effort to create, but there are hundreds of open source variable fonts available: https://fonts.google.com/?vfonly=true



This seems to contain the stats you were looking for: https://www.fourth.com/wp-content/uploads/2019/10/US_Infogra...

