zkstefan's comments

zkstefan · on June 21, 2024

> There aren't any open source audio-to-audio models yet

I don't think that's true. See https://huggingface.co/facebook/seamless-m4t-v2-large for example. It's not general-purpose like GPT-4o, but translation still seems pretty useful.

modeless · on June 21, 2024

I don't think SeamlessM4T qualifies as an end-to-end audio-to-audio model. The paper states "the task of speech-to-speech translation in SeamlessM4T v2 is broken down into speech-to-text translation (S2TT) and then text-to-unit conversion (T2U)". And while language translation is an important application, as you mention, the model is strictly limited to it. It wouldn't understand or produce non-speech audio (e.g. singing, music, environmental sounds), and you can't have a conversation with it.

zkstefan · on Nov 11, 2022

I don't disagree that variable fonts require a lot of effort to create, but there are hundreds of open source variable fonts available: https://fonts.google.com/?vfonly=true

zkstefan · on July 9, 2021

https://meowni.ca/font-style-matcher/

zkstefan · on June 16, 2021

This seems to contain the stats you were looking for: https://www.fourth.com/wp-content/uploads/2019/10/US_Infogra...