As others mentioned, anything AI (Stable Diffusion, for instance, is 5.86 GB, ComfyUI 5.19 GB, then there is Whisper, OpenLyrics, LLMs), ... and this is not counting the model files, of course.
Some client projects are packaged in a way that brings in a lot of unneeded dependencies. They could be repackaged to be 'lite', but no one thinks this is worth the time investment or is going to pay for it.
In almost all cases venvs are needed because the dependencies are brittle, and changing the Python version or some package version breaks the program.
Considering that those sizes don't even include the models, what is actually taking up several gigabytes? How is it so much code? Or is it more than code?
I know it includes dependencies, but I'm still baffled.
Like the other comment mentions, my venv also regularly reaches 5 GB thanks to deep learning libraries.
The Nvidia packages alone take 2.8 GB and torch another 1.4 GB. Numpy, transformers, matplotlib, and two dependencies named sympy and triton add another 500 MB, and with that alone we're already at 4.7 GB.
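If you want to see where the space goes in your own venv, a rough sketch like the one below should give a similar breakdown. It assumes you run it with the venv's own interpreter so that `sysconfig` points at that venv's site-packages. The helper names `dir_size` and `largest_entries` are made up for this example, and it sums apparent file sizes, so the totals can differ a bit from what `du` reports.

```python
import os
import sysconfig
from pathlib import Path


def dir_size(path: Path) -> int:
    """Total apparent size in bytes of the regular files under path."""
    if path.is_symlink():
        return 0  # skip symlinks so nothing is counted twice
    if path.is_file():
        return path.stat().st_size
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            fp = Path(root) / name
            if not fp.is_symlink():
                total += fp.stat().st_size
    return total


def largest_entries(site_packages: Path, top: int = 10):
    """Return (name, bytes) for the biggest top-level entries, largest first."""
    sizes = [(entry.name, dir_size(entry)) for entry in site_packages.iterdir()]
    return sorted(sizes, key=lambda item: item[1], reverse=True)[:top]


if __name__ == "__main__":
    # "purelib" is the site-packages directory of the running interpreter.
    site_packages = Path(sysconfig.get_paths()["purelib"])
    for name, size in largest_entries(site_packages):
        print(f"{size / 1e9:6.2f} GB  {name}")
```

Note that a single distribution can be split across several top-level directories, so you may need to add a few rows together to get a per-package figure.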