Well, it's nice when the phone is considered a secure channel. That's not the case for many serious applications, however. PGP was invented to deal with situations where your communication channels are untrusted. See, no one says your software is bad, but when it is marketed as a better alternative to PGP, that's not true, and worse, it's an absolutely irresponsible thing to do.
According to the parent's nice talk[1], you can add a verify switch that lets you compare the signature of the actual key. So a public authenticated channel is enough.
I'm not sure we are on the same page here. If I have control over the channel you use to pass your code, I can receive your secret file; I just need to be quicker than the legitimate recipient. How will this '--verify' flag help you then?
The assumption is that Alice recognizes Bob's voice. If Eve manages to eavesdrop on the call and then sits in the middle, or beats Bob to connecting to the wormhole server, Alice will still see that the fingerprint Bob dictates over the phone does not match the fingerprint of the key her computer proposes to use for the file transfer. Alice will therefore abort the transmission.
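To make the comparison step concrete, here is a rough sketch of the idea, not the actual wormhole implementation. The function names, the "verifier:" prefix, and the 16-character truncation are all made up for illustration; the only point is that Alice compares a hash of her own session key against whatever Bob reads out, and refuses to continue on a mismatch.

```python
import hashlib
import hmac
import secrets


def fingerprint(session_key: bytes) -> str:
    """Hypothetical helper: short hex digest of a session key, for reading aloud."""
    # The prefix and the 16-character truncation are illustrative assumptions.
    return hashlib.sha256(b"verifier:" + session_key).hexdigest()[:16]


def alice_accepts(local_key: bytes, dictated: str) -> bool:
    """Alice proceeds only if the dictated fingerprint matches her own key's."""
    return hmac.compare_digest(fingerprint(local_key), dictated)


# Honest run: Alice and Bob ended up with the same session key.
shared = secrets.token_bytes(32)
honest_ok = alice_accepts(shared, fingerprint(shared))

# Attacked run: Eve is in the middle, so Alice's key is shared with Eve,
# while Bob dictates the fingerprint of a different key (his own with Eve).
key_alice_eve = secrets.token_bytes(32)
key_bob_eve = secrets.token_bytes(32)
attacked_ok = alice_accepts(key_alice_eve, fingerprint(key_bob_eve))

print("honest transfer proceeds:", honest_ok)
print("attacked transfer proceeds:", attacked_ok)
```

Note that the phone call only has to be authentic, not secret: Eve overhearing the fingerprint doesn't help her, because she can't make Alice's key hash to it.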
With deep learning, recognizing the voice may not be good enough nowadays. Still, you only need an authenticated (possibly public) channel, similar to PGP key exchange, where you can read the fingerprint over the phone.