If you made 1,000,000 AI-generated MP3s available and 0.03% of them matched Sony’s copyrighted music catalog, then you would be liable for $250,000 * 300 in statutory damages. Arguing that infringement doesn’t occur because the incidence rate is low is papering over a serious problem.
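To make the arithmetic concrete, here is a rough sketch. 0.03% of 1,000,000 tracks works out to 300 matching tracks. The $250,000 per-work figure is simply the number quoted in the comment, taken as given rather than as a statement of what the law allows, and the function names are made up for illustration.

```python
def matching_tracks(total_tracks: int, match_rate_percent: float) -> int:
    """Number of tracks expected to match at a given percentage rate."""
    return round(total_tracks * match_rate_percent / 100)


def damages_exposure(total_tracks: int, match_rate_percent: float,
                     per_work_damages: int) -> int:
    """Matching tracks times an assumed per-work damages figure."""
    return matching_tracks(total_tracks, match_rate_percent) * per_work_damages


# Figures from the comment; 250_000 is the commenter's assumed per-work amount.
matches = matching_tracks(1_000_000, 0.03)
exposure = damages_exposure(1_000_000, 0.03, 250_000)
print(matches, exposure)
```

Even a rate that sounds negligible multiplies out to hundreds of individual works at this scale, which is the point being argued.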
It really depends on the details. If the extractable works align with songs that are over-represented in the training data, they may largely consist of performances of public domain compositions. And if extracting them also requires prompting that explicitly requests an infringing result, then the actual liability might be something you could mount a defense against.
Or hell, the extracted potentially infringing and over-represented material might all be pop music set to variations of Pachelbel's Canon, and I'd pay to see that lawsuit.
Details matter here. For a given musical performance, there are at least 3 copyrights in action:
1. The copyright on the composition. This can also include arrangements - for instance, Gershwin's original piano version of Rhapsody in Blue is now public domain, but the orchestral arrangement everyone knows is not.
2. The copyright on the sheet music: the actual layout, spacing, editorial notes, things like that. It's actually an insanely deep subject. I've got an 800-page book on it. The craft is referred to as music engraving, because up until about 40 years ago it was literally done by engraving the plates by hand. It's a much, much harder problem than normal book-style text layout, as it's fully 2D, whereas text is basically 1D with occasional special cases. (NB: This copyright is really only relevant to the musicians, conductors, etc., but it does matter.)
3. The copyright on the particular recording. This is the really relevant one. A 5-year-old recording of a 500-year-old work is very much under copyright.
If that were the case, they'd be financing this stuff. AFAIK they're not. What if you cross-referenced the Sony copyrighted catalogue with 1 million traditional/public domain songs? I'd guess 0.03% would be a rounding error.