I fine-tuned GPT-3 on the lyrics of 96 MF DOOM songs to write lyrics in his style. Then I cloned his voice with a text-to-voice model to make him rap them.
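For anyone curious what the data prep for a fine-tune like this might look like, here is a rough sketch and not the actual pipeline used. It assumes the legacy GPT-3 fine-tuning format, a JSONL file with one `prompt`/`completion` pair per line. The function names, the prompt template, the `END` stop marker and the placeholder song are all made up for illustration.

```python
import json


def build_finetune_records(songs):
    """Turn (title, lyrics) pairs into prompt/completion dicts.

    The prompt template and the END marker are hypothetical choices,
    not something taken from the original project.
    """
    records = []
    for title, lyrics in songs:
        lyrics = lyrics.strip()
        if not lyrics:
            continue  # skip songs with no usable lyrics
        records.append({
            "prompt": f"Song title: {title}\n\nLyrics:\n",
            # Leading space and a fixed end marker, so the model can learn where to stop.
            "completion": " " + lyrics + "\n\nEND",
        })
    return records


def to_jsonl(records):
    """Serialize records as JSONL: one JSON object per line."""
    return "".join(json.dumps(r, ensure_ascii=False) + "\n" for r in records)


if __name__ == "__main__":
    # Placeholder data; a real run would load the full set of songs.
    songs = [("Example Song", "first placeholder line\nsecond placeholder line")]
    print(to_jsonl(build_finetune_records(songs)), end="")
```

Each song becomes one training example, so 96 songs would give 96 lines in the file. Splitting songs into verses would give more, shorter examples, which may or may not work better.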
You will probably get better results using voice-to-voice models like RVC or Bark; that way the model stays on beat, for instance. I believe this is how most AI song covers are made. It's also easy to train an RVC model and use it on Google Colab. Training takes about 1-2 hours, and a 16-minute ebook can be generated in about 170 seconds, for instance.