How Python virtual environments work (snarky.ca)
332 points by amardeep on March 13, 2023 | 283 comments


I'm surprised at the number of people here complaining about venvs in Python. There are lots of warts when it comes to package management in Python, but the built-in venv support has been rock solid in Python 3 for a long time now.

Most of the complaints here ironically are from people using a bunch of tooling in lieu of, or as a replacement for vanilla python venvs and then hitting issues associated with those tools.

We've been using vanilla python venvs across our company for many years now, and in all our CI/CD pipelines and have had zero issues on the venv side of things. And this is while using libraries like numpy, scipy, torch/torchvision, etc.


I've been using Python since like 2006, so maybe I just have that generational knowledge and battlefront experience... but whenever I come into threads like this I really feel like an imposter or a fish out of water. Like, am I using the same Python that everyone else is using? I echo your stance - the less overhead and additional tooling the better. A simple requirements.txt file and pip is all I need.


Isn't pip + requirements.txt insufficient for repeatable deployments? You need to pin all dependencies not just your immediate project dependencies, unless you want some random downstream update to break your build. I guess you can do that by hand.. but don't you kind of need some kind of a lock file to stay safe/sane?


The simple solution to this with pip is constraints file:

    pip install <my_entire_universe_of_requirements>
    pip freeze > constraints.txt
And now in any new environment:

    pip install <any_subset_of_requirements> -c constraints.txt
Now you can install prod requirements or dev requirements or whatever other combination of requirements you have, and you are guaranteed to get the exact same subset of packages, no matter what your transitive dependencies are doing.
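For example (requirements-prod.txt and requirements-dev.txt here are hypothetical files that each list only your direct dependencies):

    pip install -r requirements-prod.txt -c constraints.txt
    pip install -r requirements-dev.txt -c constraints.txt
Both installs resolve to the exact pins recorded in constraints.txt.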

You can use pip-compile from pip-tools if you want the file to include exact hashes.
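A minimal pip-tools sketch, assuming a hypothetical requirements.in that lists only your direct dependencies:

    pip-compile --generate-hashes --output-file constraints.txt requirements.in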


This is true, but now you're explicitly depending on all of your transitive dependencies, which makes updating the project a lot harder. For example, if a dependency stops pulling in a transitive dependency past a certain version, you'll need to either recreate the constraints file by reinstalling everything, or manually remove the dependencies you don't need any more.

Also pip freeze does not emit a constraints file, it emits (mostly) a requirements file. This distinction is rarely important, but when it is, it can cause a lot of problems with this workflow. For example, a constraints file cannot include any information about which extras are installed, which pip freeze does by default. It also can't contain local or file dependencies, so if you have multiple projects that you're developing together it simply won't work. You also can't have installed the current project in editable mode if you want the simple "pip freeze" workflow to work correctly (although in practice that's not so difficult to work around).

Pip-tools does work a bit better, although the last time I used it, it considered the dependency chains for production and for development in isolation, which meant it would install different versions of some packages in production than in development (which was one of the big problems I was trying to solve).

From my experience trying basically every single option in the packaging ecosystem, there aren't really any solutions here. Even Poetry, which is pretty much best-in-class for actually managing dependencies, struggles with workspace-like installations and more complicated build scripts. Which is why I think pretty much every project seems to have its own, subtly unique build/dependency system.

Compare and contrast this with, say, NPM or Cargo, which in 95% of cases just do exactly what you need them to do, correctly, safely, and without having to think about it at all.


> This is true, but now you're explicitly depending on all of your transitive dependencies

They're constraints, not dependencies; they don't need to be installed, and you can just update your requirements as you need and regenerate them.

> Also pip freeze does not emit a constraints file, it emits (mostly) a requirements file. This distinction is rarely important, but when it is, it can cause a lot of problems with this workflow. For example, a constraints file cannot include any information about which extras are installed, which pip freeze does by default

pip freeze does not use extras notation; you just get the extra packages listed as individual dependencies. Yes, there is an important distinction between constraints and requirements, but pip freeze emits an intersecting subset of the notation.

> You also can't have installed the current project in editable mode if you want the simple "pip freeze" workflow to work correctly

That's why the workflow I gave to generate the constraints didn't use the -e flag, you generate the constraints separately and then can install however you want, editable or not.

> From my experience trying basically every single option in the packaging ecosystem, there aren't really any solutions here. Even Poetry, which is pretty much best-in-class for actually managing dependencies, struggles with workspace-like installations and more complicated build scripts. Which is why I think pretty much every project seems to have its own, subtly unique build/dependency system.

People have subtly different use cases that make a big impact on what option is best for them. But I've never been able to fit Poetry into any of my use cases completely, whereas a small shell script to generate constraints automatically out of my requirements has worked exceedingly well for pretty much every use case I've encountered.


I have been using pip since 2014 and did not know about the constraints. This solves my issue with sub-dependency version pinning!


'pip freeze' will generate the requirements.txt for you, including all those transitive dependencies.

It's still not great though, since that only pins version numbers, and not hashes.

You probably don't want to manually generate requirements.txt. Instead, list your project's immediate dependencies in the setup.cfg/setup.py file, install that in a venv, and then 'pip freeze' to get a requirements.txt file. To recreate this in a new system, create a venv there, and then 'pip install -c requirements.txt YOUR_PACKAGE'.
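As a rough sketch of that workflow (assuming your package lives in the current directory and declares its immediate dependencies in setup.cfg/setup.py):

    # development machine: resolve the dependency tree once and record it
    python -m venv .venv && . .venv/bin/activate
    pip install .
    pip freeze > requirements.txt
    # new system: reproduce exactly that tree
    python -m venv .venv && . .venv/bin/activate
    pip install -c requirements.txt .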

The whole thing is pretty finicky.


Not sure why repeatable deployments would be a problem. You can pin all dependencies by issuing a

    pip freeze > requirements.txt
if you want. The only catch is you should be using a similar architecture and Python version in both development and production.

This would also pin a few non-project dependencies such as `setuptools`, but that shouldn't be a problem.

Edit: TIL that pip constraints is a thing. See the comment posted by oblvious-earth for an even better approach.


Is it "generational knowledge and battlefront experience" or just "getting used to the (shitty) way things have always been" and Stockhold Syndrome?


It was pretty bad before but now it seems like there are a bunch of competing solutions each with their own quirks and problems. It feels like the JavaScript ecosystem.


Ironically, the Javascript ecosystem is far better than the Python ecosystem when it comes to packaging and dependencies. NPM just does the right thing by default: you define dependencies in one place, and they are automatically fixed unless you choose to update them. Combine that with stuff like workspaces and scripts, and you basically have everything you need for the vast majority of use cases.

Yes, there's also other options like Yarn, which have typically had newer features and different approaches, but pretty much everything that works has been folded back into NPM itself. Unless you really want to live at the bleeding edge for some reason, NPM is perfectly sufficient for all your needs.

In contrast, the closest thing to that in the Python ecosystem is Poetry, which does a lot of things right, but is not supported by Python maintainers, and is still missing a handful of things here and there.

I'm not saying the JS ecosystem as a whole is perfect, but for packaging specifically, it's a lot better than Python.


> they are automatically fixed unless you choose to update them

That's a good way to never get vulnerabilities fixed.

It hardly seems like "the right thing" to me.


I mean, a project needs regular care and maintenance, however you organise it. If you're never scheduling time to maintain your dependencies, you're going to be in trouble either way. But at least if you lock your dependencies, you know what will actually get installed, and you can find the buggy or insecure versions.

We found a bug on a Python project I worked on recently that only seemed to happen on certain machines. We couldn't reproduce it in a dev environment, and one machine that was affected suddenly stopped being affected after a while. It turns out the issue was a buggy dependency: one particular build of the project happened to have picked up the buggy version, but later builds used the fixed version and so didn't have a problem. So we'd only see the bug depending on which build the machine had last used, and if someone put a different build on there, it would reset that completely. On our development machines, we used slightly different builds that just happened not to have been affected.

Pinning dependencies wouldn't necessarily have prevented the bug in the first place - sometimes you just have buggy dependencies - but the debugging process would have gone much more quickly and smoothly with a consistent build environment. We could also have been much more confident that the bug wouldn't accidentally come back.


You should really start using linux distributions. These problems are all solved and have been solved for a long time.


That's definitely a solution, but it comes with its own problems, in particular that you add a significant dependency on what is essentially a middleman organisation trying to manage all possible dependencies. This doesn't scale very well, particularly because there's a kind of M×N problem where M packages can each have N versions which can be depended on. In practice, most distros tend to only support one version of each package, which makes the job easier for the distro maintainer, but makes things harder for everyone else (library authors get bug reports for problems they've already fixed, end users have less ability to choose the versions they need, etc).

In particular, it also makes upgrading a much more complex task. For example, React releases new major versions on a semi regular basis, each one containing some breaking changes, but not many. Ideally there wouldn't be any, but breaking changes are inevitable with any tool as situations change and the problem space becomes better understood. But because the NPM ecosystem generally uses locked dependency lists, end users can upgrade at their leisure, either with small changes every so often, or only upgrading when there's a good reason to do so. Both sides can be fairly flexible in how they do things without worrying about breaking something accidentally.

Under a Linux distribution model however, those incremental breaking changes become essentially impossible. But that means that either projects accumulate cruft that can't ever be removed, which makes maintainers' and users' lives more complex, or projects have to do occasional "break everything" releases à la Python 2/3 in order to regain order, which is also more work for everyone. There is a lot less flexibility on offer here.

I don't think these sorts of problems disqualify the Linux distribution model entirely - it does do a lot of things well, particularly when it comes to security and long-term care. But there's a set of tradeoffs at play here, and personally I'd rather accept more responsibility for the dependencies that I use, in exchange for having more flexibility in how I use them. And given the popularity of language-specific package repositories that work this way, I get the feeling that this is a pretty common sentiment.


What happens when your distribution only has old versions, or worse, no versions of the libraries you need? You hop distributions? You layer another distribution like Nix or Anaconda over your base distribution? You give up and bundle another entire distribution in a container image?


You make a package for the thing you need.


So the "solution to packages" is to make your own package with someone's else package?

If it's that simple, how come no one already did all that work for us?


It 200% is "the right thing".

Updating packages should be strictly left to the developer's discretion. That schedule is up to the developer using the packages, not upstream.

Not to mention that dependencies updating themselves whenever they like to "fix vulnerabilities" is a sure-fire way to break your program and introduce breakage and vulnerabilities in behavior...


The Javascript ecosystem for other things, like frameworks, sure.

But when it comes to packages and "virtual envs" the Javascript ecosystem is leaps and bounds better.


The "Javascript ecosystem" on my personal experience seems to prefeer installing everything in the global environment "for ease of use convenience" and then they wonder how did a random deprecated and vulnerable dependency get inside their sometimes flattened, sometimes nested, non-deterministic dependency chain (I wish the deterministic nested pnpm was the standard...) and (pretend) they did not notice.

That being said, the Javascript ecosystem has standardized tooling to handle that (npx) that Python doesn't (I wish pipx was part of standard pip); they just pick the convenient footgun approach.


I don't think so. Python is batteries included, and most packages in the Python ecosystem are not as scattered as npm packages. The number of packages in a typical Python project is much smaller than in a Nodejs project. I think that's the reason why people are still happy with simple tools like pip and requirements.txt.


People are happy?

It's one of the major sources of dissatisfaction with Python!


or the third option: did the whole packaging nonsense actually get kinda alright lately?


There's a PEP to get a part of it right [1] - at least on the installation-of-dependencies and need-for-a-virtualenv side - but atm the packaging nonsense is still as bad as it always has been.

https://peps.python.org/pep-0582/

Sample comment from its discussion:

>> Are pip maintainers on board with this?

> Personally, no. I like the idea in principle, but in practice, as you say, it seems like a pretty major change in behaviour and something I’d expect to be thrashed out in far more detail before assuming it’ll “just happen”.

As if the several half-arsed official solutions already existing around packaging (the several ways to build and create packages) had deep thinking and design behind them...


Twice bricking my laptop’s ability to do python development because of venv + symlink bs was the catalyst I needed to go all-in on remote dev environments.

I don’t drive python daily, but my other projects thank Python for that.


How do you brick a machine with venvs?


He runs pip as root and doesn't use venvs.


By debugging homebrew issues during Monterey updates.

I didn’t brick the machine, just the ability to setup a typical python venv.


System administration skills are necessary to be a productive developer. There is nothing re: python that can't be fixed with a few shell commands.


If a rogue package rm's your root directory as root, you need a bit more than a few shell commands to fix it.


Can't happen unless you install as root. You're NOT DOING THAT are you?

Also, LiveCDs have been a thing for about twenty years. Recovery has never been easier, even after hardware failure.


It doesn't even need root to cause damage most of the time, it just needs to overwrite all files under your user by mistake.

> Recovery has never been easier, even after hardware failure.

If you can use a LiveCD to repair it, it most likely wasn't a hardware failure to start.


I've managed to break venv, npm and composer (php).

I don't use that as a reason to choose what I'll use in my projects, that's decided by the PTSD incurred from 7 years of php.


It's really inconvenient for simple use cases. You don't even get a command to update all packages.


Lol. You put "simple" and "requirements.txt" unironically next to each other...

I mean, I think you genuinely believe that what you suggest is simple... so, I won't pretend to not understand how you might think that. I'll explain:

There's simplicity in performing a process and simplicity in understanding it. It's simple to make more humans; it's very hard to understand how humans work. When you think about using pip with requirements.txt you are doing the simple-to-perform part, but you have no idea what stands behind that.

Unfortunately for you, what stands behind that is ugly and not at all simple. Well, you may say that sometimes it's necessary... but, in this case it's not. It's a product of multiple subsequent failures by the people working on this system: a series of mistakes, misunderstandings, and bad designs which set in motion processes that in retrospect became impossible to revert.

There aren't good ways to use Python, but even with what we have today, pip + requirements.txt is not anywhere near the best you can do, if you want simplicity. Do you want to know what's actually simple? Here:

Store links to Wheels of your dependencies in a file. You can even call it requirements.txt if you so want. Use curl or equivalent to download those wheels and extract them into what Python calls "platlib" (finding it is left as an exercise for the reader), removing everything in the scripts and data directories. If you feel adventurous, you can put the scripts into the same directory where the Python binary is installed, but I wouldn't do that if I were you.
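A rough shell sketch of that approach (wheels.txt, with one wheel URL per line, is a hypothetical file; wheels are plain zip archives, and platlib can be located via sysconfig):

    PLATLIB=$(python -c "import sysconfig; print(sysconfig.get_paths()['platlib'])")
    while read -r url; do curl -LO "$url"; done < wheels.txt
    for whl in *.whl; do
        unzip -o "$whl" -d "$PLATLIB"   # unpack straight into platlib
    done
    rm -rf "$PLATLIB"/*.data            # drop the scripts/data directories, as suggested above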

Years of being in infra roles taught me that this is the most reliable way to have nightly builds running quietly and avoiding various "infra failures" due to how poorly Python infra tools behave.


What are specific problems you have with pip + requirements.txt, and why do you believe storing links to wheels is more reliable? Your comment makes your conclusion clear, but I don't follow your argument.


Pip is a huge and convoluted program with tons of bugs. It does a lot more than just download Python packages and unpack them into their destination. Obviously, if you want something simple, then an HTTP client, which constitutes only a tiny fraction of pip, would be a simpler solution, wouldn't it?

In practice, pip may not honor your requirements.txt the way you think it would. Even if you require exact versions of packages (which is something you shouldn't do for programs / libraries). This is because pip will install something first, with its dependencies, and then move to the next item, and then this item may or may not match what was already installed.

The reason you don't run into situations like this one often enough to be upset is because a lot of Python projects don't survive for very long. They become broken beyond repair after a few years of no maintenance. Where by maintenance I mean constant chasing of the most recent set of dependencies. Once you try to install an older project using pip and requirements.txt, it's going to explode...


Except when you try to move it, or copy it to a different location. This _almost_ made sense back when it was its own script, but it hasn't made sense for years, and the obstinate refusal to just sit down and fix this has been baffling.

("why not make everyone install their own venv and run pip install?" because, and here's the part that's going to blow your mind: because they shouldn't have to. The vast majority of packages don't need compilation, you just put the files in the right libs dir, and done. Your import works. Checking this kind of thing into version control, or moving it across disks, etc. etc. should be fine and expected. Python yelling about dependencies that do need to be (re)compile for your os/python combination should be the exception, not the baseline)


> Except when you try to move it, or copy it to a different location.

Or just, y'know, rename the containing folder. Because last night I liked the name `foo` but this morning I realized I preferred `bar`, and I completely forgot that I had some python stuff inside and now it doesn't work and I have to recreate the whole venv!


Why does that break venv? I thought it'd be linking to things outside of itself but shouldn't be aware of where it is.

(Sorry, not a python expert)


When creating the venv it hardcodes some paths so the python interpreter knows where to find its modules and the like.
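For illustration, here is roughly where that hardcoding lives (the paths and version below are made up):

    $ cat .venv/pyvenv.cfg              # records where the base interpreter lives
    home = /usr/local/bin
    include-system-site-packages = false
    version = 3.11.4
    $ head -1 .venv/bin/pip             # console scripts get an absolute shebang
    #!/home/me/project/.venv/bin/python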

That said, re-creating a venv shouldn't be hard and if it is you're doing something wrong in your development setup.


What am I doing wrong? As far as I can tell, I have to:

1. Copy my code out from the venv folder

2. Delete the venv folder

3. Make a new venv

4. Copy my code back into the new venv folder

5. Re-install dependencies

This doesn't take much longer than 60 seconds, but that's 55 seconds more than I want to spend. How is this a good process? It just makes me avoid using python (at least when I'd need anything outside the standard library).

Is there a simple command that will do this all for me?

Note that I don't typically have a git repository or similar set up because I use python for very simple semi-throw-away scripts. I just want to be able to rename the containing folder and have my script still work.


Your code should not be inside the venv folder. For reference my projects usually look something like this:

     project
     ├─ venv
     |  ╰─ ...
     ├─ pyproject.toml
     ╰─ project
        ├─ __init__.py
        ├─ __main__.py
        ╰─ app.py
Which means recreating the venv is as easy as removing the venv folder, creating a new venv, and running `pip install -e .` when using pyproject.toml or `pip install -r requirements.txt` when using a requirements file.
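In shell terms, a minimal sketch of that recreation step (run from the project root, venv folder named as in the tree above):

    rm -rf venv
    python -m venv venv
    . venv/bin/activate
    pip install -e .        # or: pip install -r requirements.txt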

This of course doesn't quite solve the moving-the-folder issue, for which unfortunately there isn't an amazing solution currently. One thing you can do is have the venv somewhere else entirely. That way you can keep the venv in a fixed place so it doesn't break but still move the code to wherever you want to put it. For tiny scripts like yours you might be better served not using a venv at all, and just using `pip install --user` for all your packages. Which is a bit messy but has served me for years and years before I landed on the pattern I use now.

Another "unfortunately" is that none of this stuff is documented very well. Writing a working pyproject.toml for example requires switching between the PEP introducing them, the pip documentation, and the setuptools documentation.


The frustration is more than real.


I have drunk the Python kool-aid for too long, but you are absolutely right that this should be corrected.


Every now and then I wake up from the kool-aid stupor because I've been using a different programming language and ecosystem for a while, and coming back to Python is just an excruciating exercise in "why is this still so shit?" (who wants to talk about pip vs npm when you're a package maintainer? Anyone?)


> Except when you try to move it, or copy it to a different location.

The article says it is explicitly not designed for that: "One point I would like to make is how virtual environments are designed to be disposable and not relocatable."


Good job, you spotted the exact problem: that's what they were originally designed for, and that makes no sense in 2023 (or 2020, or the day venv functionality was added into mainline Python), when literally everything a venv does could trivially be achieved in a new location if it didn't hardcode everything relating to paths.

There is literally nothing about a venv that somehow magically makes it impossible to still work after relocation. Breaking the venv on relocation was a conscious choice that has been insisted on to this day for no good reason other than "a long history of not bothering to fix this nonsense is all the justification we need to continue not fixing this nonsense".


Jeez just clone a venv with venv cloning tools


You mean `python -m venv`? Because that's literally that with just as little effort, and then you copy requirements.txt, but the whole point is that in 2023, this should not be necessary and the continued insistence by both python and specifically venv maintainers that it somehow needs to be this way, is insane. And telling people that they should just use clone tools is equally insane when we could just...

...you know...

...fix virtual environments?


No there is a whole python module for cloning venvs.


Which serves to highlight the insanity of it all? How is having a separate module for that not even worse than, as OP suggested - fixing virtual environments?


> Most of the complaints here ironically are from people using a bunch of tooling in lieu of, or as a replacement for vanilla python venvs and then hitting issues associated with those tools.

That's because vanilla python venvs feel like a genius idea that wasn't thought out thoroughly; they feel as if there's something missing... So there are naturally lots of attempts at improvements, and people jump at those...

And when you think about it in bigger depth, venvs are themselves just another one of the solutions used to fix the horrible mess that is python's management of packages and sys.path...

The "Zen of Python" says "There should be one-- and preferably only one --obvious way to do it.", so I can't understand why it's nowhere near as easy when it comes to Python's package management...


Honestly, virtual environments are one of the reasons why I prefer to avoid Python whenever I can.


To me, the virtual environments are a symptom. The cause is people defending Python even when it’s not as good as the alternatives. Every language has flaws. Every language has things it can learn from other languages. Every language is a trade off of different features. But somehow Python packaging, despite being really unpleasant compared to other languages and quite stagnant, is defended very vigorously in these threads (which lol are constantly recurring, unlike other languages). Just last week, I tried to install a Python program from 2020. I failed. I think the problem is that it relied on Pandas and maybe Pandas from then doesn’t work? I have no idea what the real flaw was, but jeez, it’s annoying to have people act like there is no problem. Yes, there is a big problem! This is a dead parrot.


why?

Also, you don't have to use them.


Which leaves you with what, not installing packages, or sharing packages between everything on my system and all the apps I work on? The '80s just called, they want their development methodologies back.


Is this a response?

I asked what's wrong with a venv and I got a rant…


It strikes me that virtual environments are a fairly ugly hack to smooth over the fact that Python is not a stable language. It changes a lot, requiring particular runtimes for particular Python code, and therefore the installation of multiple runtimes.

That's a pretty serious downside to the language. Virtual environments are needed to help people deal with that downside.


Uh? To me they're just a convenient quick way to install stuff that I don't want to install system-wide, if I want to do some quick experimenting.

The normal, permanent, stuff gets installed system wide the normal way, with apt.


Well, if you develop and ship in a vm/container, you don't have to do it on your system /s


It's incredibly lacking in features. PyPI doesn't even properly index packages, making pip go into this dependency resolution hell trying to find a set of versions that will work for you. It works for simple cases with few dependencies/not a lot of pinning. But if your needs are a bit more complex it certainly shows its rough edges.

I actually find it amazing that the python community puts up with that. But I suppose fixing it is not that pressing now that the language is widely adopted. It's not going to be anyone's priority to mess with that. It's a high-risk, low-reward sort of project.


I've been writing Python for a looong time. I have pushed out thousands and thousands of deployments across probably 40+ distinct Python codebases and only once or twice have I ever encountered a showstopper dependency resolution issue. At the end of the day you should want to have fine grained control over your deps and frankly there are many times where a decision cannot be automatically made by a package manager. Pip gets beat on hard but it puts in work all day every day and rarely skips a beat. It's entirely free and developed with open source contributions.

Areas where I have felt a lot of pain is with legacy Ruby projects/bundler. Don't get me started on golang.

Can pip be made better? Sure. Should we have an attitude of disgust towards it? Heck no!


> once or twice have I ever encountered a showstopper dependency resolution issue.

Hahaha... (rolls on the floor) Do you want to know why that is? No seriously? I'm not laughing at you as much as I'm laughing at Python now, but hey, well, anyways, do you want to know why that happened to you? I know you don't. But I'll tell you anyways!

Until quite recently, pip didn't give a rat's ass if the dependencies it installed were consistent. It would blink a message in the long stream of vomit it spills on the screen saying something like "you have package X installed of version Y, but package Z wants X of version Q, which will not be installed". And happily streamed more garbage to your screen.

It was an issue that was filed against pip for something like 12 years until it got resolved about a year or so ago. Even after it got resolved a lot of people tried to upgrade, saw that that would "break" their deployment, and rolled back to the latest broken version.

Things are sort of improving gradually since then, but we are light years away from the system working properly, and I know you don't want to know why, but I'll tell you anyways!

So, a lot of packages, when they roll out their "releases", also upload what Python calls a "source release". Which should never have been treated as an installation option, but it is, and is treated like that by default. So, what will happen once pip finally gives up on finding a match for a dependency it thinks you need? Right, you guessed it! -- It's going to try to build it! Installing build dependencies along the way. What you get in the end is anyone's guess, but most likely it's something broken, because the developers who made this release didn't make a release specifically for your version.

Don't despair. There's a flag you can use with pip install that should prevent it from trying to build stuff. But two bad things will happen to you if you use it: in any non-trivial project your dependencies will irreparably break. And, who knows if that flag is implemented correctly... nobody in the real world is using that. So, who knows, maybe it'll format your hard drive along the way.
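For reference, the flag being alluded to is presumably pip's --only-binary option; a sketch:

    # refuse to fall back to building sdists; fail if no compatible wheel exists
    pip install --only-binary :all: -r requirements.txt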


I understand that you're being hyperbolic for rhetorical purposes, but I think you're overselling the problem with source distributions: most language package ecosystems boil down to the same "baseline" package representation as sdists do, and have the same basic "build it if no binary matches" approach. Many don't even provide built distributions; Rust and Go come to mind.

Python's problem isn't with source distributions as such, but with really bad metadata control (and excessively permissive approaches to metadata generation, like arbitrary Python setup files). Better metadata makes source-based language package management work just fine in every other language's ecosystem; much of the effort in Python packaging over the last ~8 years has been slowly turning Python in that direction.


> Python's problem isn't with source distributions as such, but with really bad metadata control

One doesn't preclude the other. I'm not against having a mechanism for automating source installs (like it's done in, e.g., RHEL-based distros), but it's insanity if you allow this to happen by default. You may not remember Bumblebee deleting /usr while running some innocuous-looking code during install, but things happen... really bad things...

Things don't need to happen all the time in order for them to be scary. It's enough to have possible catastrophic consequences, even if the event itself is rare.

> Better metadata makes source-based language package management work just fine in every other language's ecosystem

I haven't seen a single one, and I used dozens at this point. This is never a good idea. It's OK to do source installs for development, it's never a good idea to do source installs for deployment. It "works" in other places because of how it's presented (i.e. nobody expects this to be the method of software delivery to the end user). Like, eg. in Cargo (Rust): you, as a developer, download sources and build programs from all the sources on your computer, but your user gets a binary blob they put on the system path and run. It would be insanity and a security nightmare if users were supposed to compile program code before they could run it. The select few who can audit what's being downloaded and how it's been compiled would probably manage, the rest would become victims of all sorts of scams or just random failures propagating beyond their builds into their systems.

> much of the effort in Python packaging over the last ~8 years has been slowly turning Python in that direction.

I'm sorry, but PyPA is managed by clueless people. Whatever they do there only breeds more insanity over time. They neither have a general direction in which they want to take the packaging system, nor do they understand the fine details of it. They are also bombarded by insane requirements for useless and harmful features, which they are often quick to implement... It's a circus what's going on there. I lost hope years ago, and now I've become an accelerationist. I just like to see it burn and people run around screaming while their backs are on fire. I get paid to fix this mess. So, PyPA's incompetence is my job security.


It behaves like a kid you send to the store with a hundred dollars


I moved to poetry (ergonomics) and to publishing wheels with frozen requirements, at least for apps. Here's the plugin I used: https://github.com/cloud-custodian/poetry-plugin-freeze .. the readme has details; tl;dr, freeze the graph for repeatability regardless of tool, a la pip install working years later.


> only once or twice have I ever encountered a showstopper dependency resolution issue

I've encountered them with other languages and they're the sort of thing where one time is more than enough to make me feel like it could get me fired; they're Never (with a capital N) okay imo


What does that have to do with venvs?

I agree the packaging and distribution setup in python is an absolute mess, but that's entirely unrelated to venvs. It's like bringing up how python uses whitespace instead of curly-braces.


venvs are the recommended workaround for the fact that python packaging and distribution is a mess of global state. Languages with working packaging and distribution don't generally bother with anything venv-like.


Sure, but that's like 99% pip. Venvs are patching it (quite effectively), not causing it.


I think the GP comment might have caused some confusion since it mentioned both package management and venvs very close together.


I hate PyPI probably even more than you do, but venv doesn't do that. All it does is write a handful of files and make a bunch of symlinks. It doesn't deal with installation of packages.


Ok. Fair. Venvs are great, unless you want to install packages on them.


I've never used anything but vanilla Python venvs, and no they don't work reliably. What does is a Docker container. I keep hearing excuses for it, but the prevalence of Dockerfiles in GitHub Python projects says it all. This is somehow way less of an issue in NodeJS, maybe because local environments were always the default way to install things.


> This is somehow way less of an issue in NodeJS, maybe because local environments were always the default way to install things.

There's also NodeJS's ability for dependencies to simultaneously use conflicting sub-dependencies.


Yeah, you can't have two deps use different versions of the same sub-dep, cause they flatten everything instead of having a tree. In practice I rarely have issues with this except in React-Native, where it's a common problem, but then again RN is doing some crazy stuff to begin with. Often just force install deps and things work anyway.

Side note, there are way too many React/React-Native "router" type packages, and at least one of them breaks its entire API every update (I think https://reactrouter.com/en/main/upgrading/v5, how are they on version 6 of this). It's so bad that you can't even Google things anymore cause of the naming conflicts.


The most important part about venv is that you shouldn't need it. The very fact that it exists is a problem. It is a wrong fix to a problem that was left unfixed because of it.

The problem is fundamental in Python in that its runtime doesn't have a concept of a program or a library or a module (not to be confused with Python's modules, which is a special built-in type) etc. The only thing that exists in Python is a "Python system", i.e. an installation of Python with some packages.

Python systems aren't built to be shared between programs (especially so because it's undefined what a program is in Python), but, by any plausible definition of a program, venv doesn't help to solve the problem. This is also amplified by a bunch of tools that simply ignore the existence of venvs.

Here are some obvious problems venv doesn't even pretend to solve:

* A Python native module linking with shared objects outside of Python's lib subtree. Most comically, you can accidentally link a python module in one installation of Python with Python from a "wrong" location (and also a wrong version). And then wonder how it works on your computer in your virtual environment, but not on the server.

* venvs provides no compile-time isolation. If you are building native Python modules, you are going to use system-wide installed headers, and pray that your system headers are compatible with the version of Python that's going to load your native modules.

* venv doesn't address PYTHONPATH or any "tricks" various popular libraries (such as pytest and setuptools) like to play with the path where Python searches for loadable code. So much so that people using these tools often use them contrary to how they should be used (probably in most cases that's what happens). Ironically, often even the authors of the tools don't understand the adverse effects of how the majority is using their tools in combination with venv.

* It's become a fashion to use venv when distributing Python programs (eg. there are tools that help you build DEB or RPM packages that rely on venv) and of course, a lot of bad things happen because of that. But, really, like I said before: it's not because of venv, it's because venv is the wrong fix for the actual problem. The problem nobody in Python community is bold enough to address.


> The most important part about venv is that you shouldn't need it. The very fact that it exists is a problem. It is a wrong fix to a problem that was left unfixed because of it.

What Python needs is a tool that understands your project structure and dependencies so the rest of your tools don't have to.

In other languages, that's called a build tool, which is why people have a hard time understanding that Python needs one.


Oh, yeah? It’s working great? Like figuring out which packages your application actually uses? Or having separate development and production dependencies? Upgrading outdated libraries?

Having taken a deep-dive into refactoring a large python app, I can confidently say that package management in python is a pain compared to other interpreted languages.


Virtual environments aren't package management. For example we use Poetry for package management - it supports separate dev and prod dependencies, upgrading etc. It generates a virtual environment.


The distinction feels entirely academic to me. Managing packages means having a sane way to define the dependencies of software projects, so they can be checked into version control and be installed reproducibly later and/or elsewhere.

I don’t know which problem python intended to solve by separating the two, but it doesn’t occur often in contemporary software engineering work.

Having said that, the point you make is valid and Poetry is a good option, but it feels so maddening having to learn about like seven different tools which all do more or less the same but not quite, and everyone and their mother having an opinion on which is the best. Doesn’t help that there’s an arbitrary yet blurry line where package managers end and environment managers begin.


I strongly agree with this, and I have been actively using Python since 2009.

Trying to keep a Pygame/Numpy/Scipy project working has been a real struggle. I started it with Python 2 and ported to Python 3 some years ago. The whole Python 3 transition is a huge mess, with every Python 3 point release breaking some things. No other interpreted language’s packaging system is so fucked up.

On a positive note: Lately I've liked using pdm instead of pip, and things seem to work quite a lot better. I evaluated Poetry, Flit and something else also.

I just commented about this on Twitter, when someone asked “Which programming language do you consider beginner's friendly?” https://twitter.com/peterhil/status/1633793218411126789


Likewise, I think people have a negative first experience because it doesn't work exactly like node, throw their toys out the pram and complain on HN for the rest of time.

Guess in taking this stance we're both part of the problem... \s


Because even with --copies it creates all kinds of symlinks, and, if you're using pyenv, hardcoded paths to the python binary which can break between CI and installation.

If you're using docker then it's a lot easier I guess.


It also quietly reuses the stdlib of whatever python you start from. Which mostly doesn’t matter in real world usage, but can be quite surprising if you ever get into your head the idea that that venv is portable.


But why bother? Just use PDM in PEP-582 mode [1], which handles packages the same way as project-scoped Node packages. Virtual environments are just a stop-gap that persisted for long enough for a whole ecosystem of tooling to support them. It doesn't make them less bad, just less frustrating to deal with.

[1] https://pdm.fming.dev/latest/usage/pep582/


My complaints stem from libraries/OSes requiring different tools. So conda is sometimes required, and pip is also sometimes required, and some provide documentation only for pipenv rather than venv. And then you've got Jupyter, which needs to be configured for each environment.

On top of that there are some large libraries that need to only be installed once per system because they're large, which you can do but does mess with dependency resolution, and god help you if you have multiple shadowing versions of the same library installed.

I wish it was simpler. I agree the underlying system is solid, but the fact that it doesn't solve some issues means we have multiple standards layered on top, which is itself a problem.

And great if you've been using vanilla venvs. Good for those that can. If I want support for Apple's hardware I need to use fucking conda. Heaven help me if I want to combine that in a project with something that only uses pip.


I agree with this 100%. Simple venv works reliably.

The only gotcha I've had is to make sure you deactivate and reactivate the virtual environment after installing Jupyter (or iPython). Otherwise (depending on your shell) you might have the executable path to "jupyter" cached by your shell so "jupyter notebook" will be using different dependencies to what you think.

Even comparatively experienced programmers don't seem to know this and it causes a lot of subtle problems.

Here's some details on how bash caches paths: https://unix.stackexchange.com/questions/5609/how-do-i-clear...
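For example, instead of deactivating and reactivating, you can clear the shell's command cache directly (a small sketch; bash and zsh shown):

    hash -r    # bash: forget cached executable paths such as "jupyter"
    rehash     # zsh equivalent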


I agree with the statement that venvs are usable and fine. However, they do not come without their pitfalls in the greater picture of development and deployment of python software.

It is very often not as simple as going to your target system, cloning the repo, and running a single-line command that gives you the running software. This is what e.g. Rust's cargo would do.

The problem with python venvs is that when problems occur, they require a lot of deep knowledge very fast and that deep knowledge will not be available to the typical beginner. Even I as a decade long python dev will occasionally encounter stuff that takes longer to resolve than needed.


The annoying thing with vanilla venvs (which are principally what I use) is that when I activate a venv, I can no longer `nvim` in that directory because that venv is not going to have `python-neovim` installed. This kind of state leakage is unpleasant for me to work with.


I personally hate Conda with a fiery passion - it does so much weird magic and ends up breaking things in non-obvious ways. Python works best when you keep it really simple. Just a python -m venv per project, a requirements.txt, and you will basically never have issues.
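For reference, that entire workflow is roughly (a sketch, assuming a requirements.txt in the project root):

    python -m venv .venv            # one venv per project
    . .venv/bin/activate
    pip install -r requirements.txt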


I remember when Conda just appeared... I was so high on hopium... well, the word "hopium" didn't exist yet...

Anyways. Today I have to help scientists to deal with it. And... I didn't think it was possible to be worse than pip or other Python tools, but they convinced me it is. Conda is the worst Python program of note that I had ever dealt with. It's so spectacularly bad it's breathtaking. You can almost literally take a random piece of its source code and post it to user boards that make fun of bad code. When I have a bad day, I just open its code in a random file, and like that doctor who was happy running around the imaging room exclaiming "I'm so, so, so happy! I'm so unimaginably happy that I'm not this guy! (pointing at an X-ray in his hand)" I'm happy I'm not the author of this program. I would've just died of shame and depression if I was.


If it were really that simple, surely all these other solutions wouldn't exist?


So there's two notions of simple.

Simple in the sense that it's actually simple: the software you need can be installed with pip install, with precompiled binaries for your platform when necessary; it supports Python 3.something+, and all its dependencies are either >= some version, or pinned to >= x.y, <= x+1.0.

Then there's simple as in the software is actually incredibly useful but is an absolute nightmare of complicated dependency trees where only specific pinned minor versions work together, you need multiple incompatible compiler toolchains and distro packages, it only works if you have CUDA, precompiled binaries exist for some but not all and if you use the precompiled binaries then it changes the dependency story, if you want jupyter support that's a whole different thing AHHHHHHHHHHHHH

In that case some people with more time than sanity said fuck it we'll make it work and conda was born. For me it's a lifesaver when you want to use a piece of software but I wouldn't ever dare deploy production software with it without it being strongly isolated from everything else.


it really is that simple

conda exists because it deals with an entirely different package registry and is a whole distro on its own (I don't know why people need that either; my vague impression is that science-y types want complete pushbutton installation, typing a command == fail, I guess, I don't know).

poetry exists because it does some kind of automated version management thing (as did pipenv) that I'm sure is nice but not something I've ever needed, but the buzz got out and it took over; people who have never typed "virtualenv" use poetry and they have no idea why.


Before conda, you needed a C and a fortran compiler, a BLAS and LAPACK installation, and various build tools, to install scipy. Scipy is one - 1 - dependency used in science and engineering. The pain used to be massive. Now, we’re complaining about the existence of 2 high quality (not perfect) alternatives, and pip (part of the official python distro) can install tensorflow etc, GPU libs included, with a oneliner.


These days with wheels there is 0 reason why there can't be binary pip packages for any of those. Definitely used to be a problem though.


> (I don't know why people need that either; my vague impression is that science-y types want complete pushbutton installation, typing a command == fail, I guess, I don't know).

Essentially, yes, and justifiably so. Try installing science-y Python packages written in C on Windows. When conda was created in 2012 this meant installing Visual Studio 2005 (for Python 2.7), which was hard to find on Microsoft's own website even back then.


> on Windows

Ah, I found your bug!


Conda is very nice for people who just want to write some python and not need to learn about the ecosystem or weird issues like some Visual Studio dependency not being installed to get scipy to compile on Windows or whatever. Like someone coming from R to Python. But aside from that situation I agree with you.


conda exists because it can install so much more stuff than pip without hassle, including things that aren't themselves Python packages, but which Python packages need. For example, if you ever tried to build a Python module that has some native code on Windows, you might appreciate this: https://anaconda.org/conda-forge/cxx-compiler.


so there are four responses here and 100% of them refer to a single Python package, Scipy, as the source of all the problems, based on bad experiences from over ten years ago with Python 2.

the Python package index now supports binary wheel files for all platforms and Scipy is there https://pypi.org/project/scipy/ with a few dozen distros.

is the problem solved yet ?


You can't install the CUDA base libraries with pip. You can with Conda.

So no, it isn't solved yet.
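For illustration, the kind of conda one-liner being referred to (a sketch; the channel and package name are assumptions, and exact versions vary):

    conda install -c conda-forge cudatoolkit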


Good? I don't want my package manager messing with my cuda toolkit setup!


What if you need two different versions of the cuda toolkit on the same machine?


I may be wrong, but I don’t think you would want that - it should match with your driver version. Weird stuff happens with minor version mismatches


Probably true, but it usually works anyway. And a lot of ML code depends on exact versions of the CUDA libraries, so you don't get a choice.


> vague impression is that science-y types

Because often python is only one part of the puzzle, we also need a variety of third party libraries and maybe R as well (along with R packages). And we would rather not have to explain to another researcher how to install it all.


> (I don't know why people need that either; my vague impression is that science-y types want complete pushbutton installation, typing a command == fail, I guess, I don't know).

It started in a way similar to how Active Perl started. An attempt to build a reliable, curated and "batteries included" solution. (But it went downhill very quickly).

So, what were the problems:

* Most importantly, using Python on Windows is a pain (it still is, but less so). Many Python packages rely on native libraries which they don't provide. Even if they ship with some binary blob, that blob is often just a wrapper to the actual library. But, at the time Conda was created, even a binary blob was not an option. If you wanted to install stuff, you had to compile it. And, compiling on MS Windows is a world of pain. Especially, if you compile something for Python. Again, credit where it's due, Python sort of fixed some of these problems... but just a tiny bit. It still sucks a lot. So, conda is more of a MSYS2 really: it's a Linux user-space in Windows (firstly), and Python on top of that. But, it kind of also doesn't want to admit it, so, it still pretends it's Windows... but with a bunch of Linux-ey stuff...

* Secondly, pip was a joke until about a year ago. It was not really a package manager (it still isn't) and it couldn't properly solve dependencies (it sort of does today, but poorly because of source-distributions). Conda, when it worked, was at least somewhat consistent. (But usually, it just hangs). Also, conda is a package manager (however awful). I.e. with pip you can run install once, and then you can run a different install command again in the same environment, and god help you to make sure you have consistent dependencies between packages. Conda, in principle, should always keep your environment consistent. (But it hangs more often than not). Also, there's no such thing as installing from source with conda. If you want to use source: git clone, install conda-build and suffer through non-existent documentation and get help from non-existent community of people building conda packages.

* Conda provides a curated set of packages (a.k.a default channel). Anyone worth their salt ditches that channel the moment they install conda and installs everything from conda-forge (cuz its got more stuffs!) So, they tried curated. Didn't work.

---

In a sense, I think that what happened is that conda was too ambitious of a project, using wrong technology to implement it. It also didn't have good enough planning for the kind of ambition they had. So, every time they came to solve a real problem, they didn't have time, human resources and system resources to solve it well. They've accumulated technical debt at close-to-light speed and ended up being a huge mess nobody knows how to deal with.

Some of the obvious mistakes would be: creating too many releases of conflicting packages. They should've worked on some sort of LTS solution, where they release a set of packages with very permissive dependencies of very few versions that had been proven to work well together. Instead it's very common for conda packages to be very particular (and without any real need to be) about the versions of their dependencies.

Conda people didn't build good CI. They often release absolutely broken packages, and only discover it retrospectively from community input. (Tensorflow would be a great example). This creates a lot of problems with automatic updates.

Conda gave in to community pressure and decided to build integration with pip. It doesn't work well, and it's not possible to make it work well, but they added this and a lot of people instantly created dependencies on this feature.

Conda picked a bone with some of the infra tools outside of the Python ecosystem. In particular with CMake. This just adds an extra aspect of pain when trying to build conda packages / work with native libraries / wrappers. It might seem like a minor thing, but it prevented a lot of people who might otherwise release packages both for conda and PyPI from doing so. Often a package that is released to conda is ported by someone who's not the package author. It also means that sometimes the names of packages differ between conda and PyPI for the same package.

----

NB. In terms of the number of commands you need to type, working with conda vs. PyPI tools is not noticeably different. Conda is, perhaps, more organized, but it is also more buggy due to trying to integrate with various shells in special ways, and failing quite a bit.


Not saying Conda et al. don't solve problems, especially in specific cases, but they also add them in my opinion.


Any program with real world application solves some problem. That's not the point. The point is in how does it do it. And when it comes to conda... it's like MS Outlook on steroids. I cannot really think about a better way to describe it. It's like a movie that can be so bad that it's actually good.


If software were that simple, surely all these other languages wouldn't exist?


Right, software is anything but simple


Idk, I still have no idea why yarn exists when there's npm.


I'm the opposite. We have to maintain a lot of different environments for different projects, and with conda things "just work" (esp now with better PyCharm integration). Venv is much more of a hassle.


With plain venv it’s hard to maintain multiple different Python versions on the same machine; conda makes this much easier.

Also on M1/M2 Macs some libraries (especially for ML) are only available through conda-forge.


I have like seven pythons on my machine, and I use virtualenv all over; there's no issue. What's the issue? I have to type the path of a specific interpreter when I make a virtualenv, is that the problem? I bet that's the problem.
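
For anyone wondering what that looks like in practice, a minimal sketch (the interpreter paths are just examples for whatever your system has):

    # one venv per interpreter, side by side in the same project
    /usr/bin/python3.10 -m venv .venv-3.10
    /usr/bin/python3.11 -m venv .venv-3.11
    .venv-3.11/bin/pip install -r requirements.txt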


> With plain venv it’s hard to maintain multiple different Python versions on the same machine

Ironically, given the usual software dev experience on Windows vs. Unixy systems, this is not a problem with the standard install on Windows with the py launcher.


It's not ironic at all; traditional Linux package management is actually really bad. On a Linux system you can basically only have one version of anything installed at the same time.


I have multiple Python versions on both my Macbook M1 and my Linux laptop, with several dozen venvs. I honestly don't see why others are having so many issues.

Probably bad documentation/tutorials.


Presumably they're not managed by your system package manager? Or you have one of the rare and relatively new distributions (certainly not typical Linux) that can handle that smoothly, like NixOS?


Quite typical: Linux Mint, managed by apt with a third party repository [1].

On Mac I simply use the official installers from python.org.

[1] https://launchpad.net/~deadsnakes/+archive/ubuntu/ppa


Is it? I just install different version with brew, and choose which python3.X to create a venv with. The ML packages issue is much more of an issue, but now that the ARM transition is well underway more and more packages work via normal pip.


pyenv and asdf both let you manage multiple versions of python on a single machine. They're not great but I wouldn't want to try without one of them.
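
A rough sketch of the pyenv flow, assuming pyenv is already installed and initialized (version numbers are just examples):

    pyenv install 3.11.4     # build/install that interpreter under ~/.pyenv
    pyenv local 3.11.4       # write .python-version for this project
    python -m venv .venv     # the venv now uses the pyenv-provided 3.11.4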


> With plain venv it’s hard to maintain multiple different Python versions on the same machine;

plain venv never advertised itself as a solution to this problem... I don't like this tool, but, sorry to say so, you are barking up the wrong tree.

Also, it's not hard to maintain different versions of Python on the same machine without conda. I can literally do this with my eyes closed w/o touching the mouse: it boils down to:

    cd ~/src/cpython
    git checkout 3.8
    git fetch --all
    git reset --hard origin/3.8
    git clean -Xdf
    ./configure --with-ensurepip=install --with-system-ffi=yes
    make
    sudo make altinstall
Sorry. I've done this from memory and I kept one eye open. So, I kinda lied. But not really.


> Just a python -m venv per project, a requirements.txt, and you will basically never have issues.

As long as you always remember to run exactly the right two commands every time you work on any project and never run the wrong ones, or run project A's commands in project B's terminal. There's no undo, BTW: if you ever do that, you've permanently screwed up your setup for project B and there's no way to get back to what you had before (your best bet is destroying and recreating project B's venv, but that will probably leave you with different versions of dependencies installed from what you had before).

(And as others have said, that doesn't cover multiple versions of Python, or native dependencies. But even if your project is pure python with only pure python dependencies that work on all versions, venv+pip is very easy to mess up and impossible to fix when you do)


Until you want to use anything with a c extension..


I use it all the time for things with c extensions?


Why would you want to use Python with a C extension?

If you need performance, just use native code.


Numpy and Scipy are good reasons. Unfortunately Scipy does not even compile on FreeBSD lately, and I have opened three issues about it against Scipy and Pythran (and the fix was with xsimd).

https://github.com/serge-sans-paille/pythran/issues/2070


> Why would you want to use Python with a C extension?

Because if you need performance, you need to use native code.


I would highly recommend Poetry for python package management. It basically wraps around pip and venvs offering a lot of convenience features (managing packages, do dist builds, etc.). It also works pretty nicely with Tox.

I would recommend using the virtualenvs.in-project setting so Poetry generates the venv in the project folder and not in some temporary user folder.
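
For anyone who hasn't used it, that's a one-time config flip (a minimal sketch, assuming a reasonably recent Poetry):

    poetry config virtualenvs.in-project true
    poetry install    # .venv/ now lands next to pyproject.toml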


I just compared and evaluated Hatch, Flit, Poetry and Pdm and found Pdm to be most robust and slimmest. Hatch was a good second option, and Poetry and Hatch are easy to use, but have too much bloat and magic.


When I tried pdm it wasn't stable yet and messed up my paths.

My experience with poetry has been great. I only disliked that they had auto-update on when locking files, but they changed the default.


100% this. I've always struggled with creating packages, but now simply do poetry init and I am done. Magic.


I prefer Hatch over Poetry. I don't have any strong reason for that preference, I've just used both and I feel more comfortable with Hatch. It feels a little more seamlessly integrated with other Python tools, and I appreciate the developers' conservative approach to adding features.


Thanks. I recently spent a whole afternoon learning how to package a new python project. Was really surprised at the difficulty even with venv, compared to node and java.


Answer: they don’t

(Seriously, I’ve gotten so fed up with Python package management that I just use CondaPkg.jl, which uses Julia’s package manager to take care of Python packages. It is just so much cleaner and easier to use than anything in Python.)


I hate python package management - I really do. But I've never actually had a problem with virtual environments, and I think it's because I just use virtualenv directly (rather than conda or whatever else).

I have these aliases in my .bashrc, and I can't remember the last time I had a major issue.

    alias venv='rm -rf ./venv && virtualenv venv && source ./venv/bin/activate'
    alias vact='source ./venv/bin/activate'
    alias pinstall='source ./venv/bin/activate && pip install . && pip install -r ./requirements.txt && pip install -r ./test_requirements.txt'

I don't have all the fancy features, like automatically activating the virtualenv when I cd into the directory, but I've always found those to be a bigger headache than they are worth. And if I ever run into some incompatibility or duplicate library or something, I blow away the old venv and start fresh. It's a good excuse to get up and make a cup of tea.


> source ./venv/bin/activate

To this day I'm not quite sure why the venv developers decided that sourcing was a good idea; all it does can be effectively replaced with

    #!/bin/sh
    export VIRTUAL_ENV="path to venv"
    export PATH="$VIRTUAL_ENV/bin:$PATH"
    unset PYTHONHOME
    exec "$SHELL"
Just run this script to get into an "activated" shell. To deactivate, just press Ctrl+D. If you're really fancy, you can replace the last line with

    exec "${@:-$SHELL}"
to run a command directly in the activated environment (and then deactivate it immediately).
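
Usage then looks something like this (the script name here is made up; call it whatever you like):

    ./venv-shell                   # drop into an activated subshell; Ctrl+D to leave
    ./venv-shell python -m pytest  # run one command inside the venv, then return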


This technique is simple, doesn't reinvent the wheel, and even has its own name: Bernstein chaining.

http://www.catb.org/~esr/writings/taoup/html/ch06s06.html


> virtualenv venv

That would be python2; in 3 it's "python -m venv venv" (the first venv is the module to run, the second is the directory to put it in)

Otherwise yeah, it's the same and I also use it manually. Never had any problems.


`virtualenv` still exists and is still actively developed. It's true that Python 3 ships with `venv` but I think `virtualenv` offers some additional features.

https://github.com/pypa/virtualenv


I use virtualenvwrapper[1] and can't remember any problems with virtual environments either. It sets up human readable aliases for you like "mkvirtualenv" to create a virtualenv and "workon" to activate a virtualenv.

[1] https://github.com/python-virtualenvwrapper/virtualenvwrappe...


A few weeks ago I spent about a week debugging my Poetry environment. Turns out, their latest release (which I believe was a patch bump!) brought in some breaking changes. And on top of that, a bunch of stuff was forcing python3.11 under the hood, whereas I was on python3.10.

It was a nightmare.


Poetry seems to break compatibility with every release of either itself or Python, double the fun.


This is similar to what I do, except my "new environment" alias executes a function that takes a python version, installed/specified via pyenv.

Never had a single problem, venv + pyenv is a great combo. As far as I can tell, like so many sources of frustration in tech, the issue typically lies with user error/not fully understanding the tool you're using. That isn't saying that there isn't room for improvement -- most notably, package management in Python flies in the face of "there should be one -- and preferably only one -- obvious way to do it" -- but the tools we have work quite well.


I use pyenv[1] and the pyenv-virtualenv[2] plugin and I've not had a problem. It's so easy.

[1] https://github.com/pyenv/pyenv [2] https://github.com/pyenv/pyenv-virtualenv


pyenv needs to have its shims in place by running the pyenv init. You can run it when your shell starts but I find it kind of slow and for a while it used to be wonky in fish. But once I run the init, pyenv does work.

That's just for managing your python installation and virtualenv though. You still need to manage your packages and for that you have options like requirements.txt, pipenv (not pyenv lol), Poetry, and others.


I might steal these aliases, thank you.

Using virtualenv directly has also been my approach, and has not failed me yet.

I also used Poetry for one of my personal projects, and I liked what I saw.


I have struggled with conda and the huge space it usually eats up

I should learn to use venv properly

Thanks


Agreed. I tried the new package manager combined with venv and using venv directly seems best. A lot faster for a start.


It sounds mean to say it, but it's 100% true. I moved away from using Python wherever I can. I've had colleagues struggle for days to install well-used packages like pandas and numpy in conda.


I just began writing Python a few months ago. For years prior, I'd been a JS dev, and while NPM can be frustrating at times, I never encountered so many issues as I have in Python. It's crazy.

I'm now curious whether there are languages out there that do have a really nice packaging system.


FWIW, I find Cargo to be one of the biggest reasons I like Rust so much — maybe even more than anything to do with Rust itself or safe code.

I’ll often look for command line tools written in Rust, but not because of Rust fanboyism, but because I know I can just git clone the project and immediately start hacking on a new feature I need or a quick bug fix. In almost every other language I have to jump through one million hoops before I can build and run whatever it is, let alone have a nice developer experience (autocomplete, go to definition, etc).


Yeah, one of Julia's best decisions was taking heavy inspiration from Rust for the package manager. Rust was 100% the first language to get dependency management right.


> Rust was 100% the first language to get dependency management right.

In my experience, Java, Go, PHP, NodeJS have all got similar package management that works.


See my above comment, there's a reason why all of your examples work:

Java package managers tend to install packages written in java

Go installs packages written in go, and maybe C using cgo

Cargo installs packages written in rust

php package managers install packages written in PHP, extra extensions are rare

etc

People having trouble with python are NOT having trouble with python. They are having trouble because they are trying to use packages that are just python bindings to absolutely massively complex c++ and Fortran libraries.

Often people using python don't even have a C compiler installed (let alone a fortran one for the scientific stuff), so pip blows up the first time they try to install a package that hasn't been pre-built for their system+python version.


Yeah, npm was the first good package manager. It gets a lot of hate but my experience is that its strategy is the optimal solution for the problem it solves. And, I think a lot of things people complain about (lots of trivial packages, huge dependency trees, etc.) are an effect of solving the packaging problem well: if you make it easy to add dependencies, people will take advantage of that to add lots of dependencies.


Personally, zero complaints about Cargo (Rust) and very minimal complaints about NuGet (C#/.NET). My issues around NuGet are probably self-created because I refuse to learn the CLI [0] for it and I've had occasional issues with Visual Studio's UI for managing things.

https://learn.microsoft.com/en-us/nuget/reference/nuget-exe-...


In a lot of ways, Paket is significantly better than NuGet, if you ever want to try something new :) It uses a lockfile approach like Cargo, has better dependency resolution, etc https://fsprojects.github.io/Paket/index.html


The JVM (maven) has quietly had everything working really well for decades. You rarely hear much about it because it just works, and what you hear is mostly people hating on it because it wouldn't let them shoot themselves in the foot. Cargo works much the same way AIUI.


Same here, I've been using yarn for years, and when I started using venv, I didn't understand why it had to be so complex. Even after reading this article, I still don't see why it is so complex! Yarn/npm has the right idea: dependencies go in the working folder and expect that hierarchy/protocol. Problem solved. The only problem I have with yarn/npm is the problem any package manager has and that is the attrition of dependencies and how to rank their security risk.


I can't think of any package/dep systems I actually like other than npm. And they're even starting to screw that up with the `import` weirdness instead of the `require` that's been so simple and easy.

Rust's system is probably the next best.

ObjC/Swift packaging is a flaming disaster in practice, unless it's improved since I jumped that ship. Last time, I remember every single project having to rely on Cocoapods.


Weirdness? `require` was Node weirdness because Javascript lacked imports/exports at the time. The ES6 syntax is remarkably better, allows imports to be async, and doesn't need to run the code to see what should be assigned to `module.exports` allowing it to be statically analyzed which allows tree-shaking. Node's CJS syntax will only work for Node requiring that you transpile and bundle it for browsers. The ES6 syntax will work for Node and the browser.

I see anyone sticking with CJS syntax the same way I see Python devs who continue writing 2.7 code by choice in new projects and not because they are maintaining older projects.


Import is weird because there are several different ways to do the same thing, listed at https://developer.mozilla.org/en-US/docs/Web/JavaScript/Refe... and summarized under "there are four forms of import declarations." Four! And I always forget the different ways they work.

Sure tree-shaking and browser support are nice, but they didn't have to make the syntax this complicated to achieve that. Not an issue with other languages.


I'd argue three. Namespace import is really just giving a name to the idea that "you must provide an alias to be used as a namespace if trying to do a wildcard import to avoid naming conflicts" because executing JS in a browser would otherwise not be able to detect naming conflicts - you'd end up overriding values of the same name with the most recent import which is almost never desired behavior. Think about how you might resolve the issue of two different modules exporting a function with the same name otherwise. "Mandatory namespacing" fixes the problem.

The "weird one" of the remaining three is side effect imports which isn't all that weird when you realize you're not assigning it to anything. Functionally this is the same as calling a function rather than assigning a function to a value. eg `myFunction = function myfunction() { //stuff }` vs `myFunction()` and when you think about it like that it becomes significantly less weird but also something you rarely want to do - it's mostly used for polyfills. Good to know it exists but you can probably ignore that it does.

So now you're left with two: Default and Named. Use Default when you want the entire library - or almost the entire library. Use Named when you want specific pieces of the library. That's all there is to it really. If I want a specific function from a library - there's no reason to import the entire library. For a while you'd mix both Default and Named exports for React due to how transpiling worked - this React blog post explains it well: https://reactjs.org/blog/2020/09/22/introducing-the-new-jsx-... but you don't really have a reason to mix the two in modern codebases.

Named imports tend to be preferred because Default imports mean giving a name to the import, which can result in inconsistencies across a codebase when many people are working on it. (eg: `import SumTwo from 'sumTwoNumbers.js'` vs `import AddTwo from 'sumTwoNumbers.js'`. A named import `import {SumTwoNumbers} from 'sumTwoNumbers.js'` solves this problem)

There's still one final little "gotcha" - there can only be a single Default export. Generally it's an object that contains "everything" but it doesn't need to be and those times are the only edge cases you'll run into though I can't say I ever have encountered this so it is a "theoretical" reason to avoid Default imports but I can't say it's ever been an issue in practice.

I guess I avoided a lot of this weirdness by basically only ever using the ES6 syntax and preferring Named imports (and not being stuck in the React ecosystem). CommonJS got to avoid some of the "weirdness" because it could pretend the browser doesn't exist (and leave handling it to bundling tools). So I guess I'll capitulate and say it's a little weird, but you can basically ignore it and use Named imports as the "One True Style".


Your explanation makes sense, but imo the import syntax shouldn't even require (no pun intended) an explanation.

The bigger thing is, I'm subject to however the deps I use want to export things, so they use a mix of those. Maybe in some cases you have to use `require` even if you don't want to, I forget.


I remember trying to use cocoapods back in 2015/2016, right around the time that Swift was technically available but not ready for production. I literally gave up trying to import packages, it was a shitshow.


I first used Swift at the same time. Cocoapods actually worked, but only after fighting it all day. Swift was recommended over ObjC, but it was broken. The compiler itself would segfault if I used the wrong map syntax. If the compiler worked, it took about 20X as long as an ObjC build. Core Data managed to produce non-optional objects that were nil in some cases.

Swift got fixed over time (which is why every basic SO question has 20 different answers for each Swift version), but it still sucks, and so does UIKit, and Xcode. That whole toolchain has been relegated to being just a dependency behind React Native for me. I mean look at the shitfest involved just to get a substring https://stackoverflow.com/questions/39677330/how-does-string...


I try not to hate on projects publicly, because I know a lot of devs smarter than me pour their sweat and tears into these things. But imagine releasing a new language in 2014 and fucking up strings.


Yeah my patience for Apple's native dev environment is down to nothing nowadays. The docs used to explain why not having a string length method is the right move, but an overwhelming amount of "wtf" from users got them to change it finally. At least I'll bash my own work just as much if I think I made a mistake.


Cargo (Rust) is pretty solid. Most of my minor complaints (like being unable to add packages from the CLI) have been resolved with time, as well.


IMHO, Go’s packaging system is very pleasant to use.


Elixir's packaging system is quite good. We went from empty project to working stable diffusion in 2h. 1.75 of those hours was installing CUDA.


Things like pandas and numpy are not python packages. Yes, they are packages FOR python, but they are not python.

https://hpc.guix.info/blog/2021/09/whats-in-a-package/ does a good job of explaining why installing packages like that is a complete shitshow.


Your colleagues should consider skipping conda and sticking to venv. It will make life much easier. Given that pandas/numpy are huge in data science, moving away is not much of an option unless you are working on a personal project or already have a dedicated team comfortable with a different stack. There is also the Docker option, which is great but much more involved.


My advice would be to not use Conda or other such "extra tooling". More trouble than benefit. Stick with venv and poetry.


poetry is "extra tooling"


It's really nice though.


Until it hangs at the dependency resolution step, which happened to me recently on a fastapi/sqlalchemy project. Had to add deps one by one not to overwhelm it (rolls eyes).

Also doesn't play nice with publishing to custom PyPI destinations (e.g. self-hosted GitLab) in my experience. I could have tracked down the issue, but the surrounding code was clearly a mess, so I gave up on that one.


also there's like 3 different flavors of virtual env now and me being 8 years out of date with my python skillz i have no idea what the current SOTA is with python venv tooling :/

i dont need them demystified, i need someone smarter than me to just tell me what to do lol


> and me being 8 years out of date with my python skillz i have no idea what the current SOTA is with python venv tooling :/

It doesn't really matter, by the time you sit down and use it you'll find whatever that is, has also been deprecated and replaced by 2 more.


The problem with software development in 2023 in a nutshell, well played sir.


The reality is that if you ask 3 different people you're going to get 3 different answers. They're fundamentally the same, just a matter of package management. As far as I'm aware, the current "SOTA" is Poetry. I liked Pipenv for quite some time, but Poetry is just so much faster IME.


It also makes it very hard for new devs wanting to learn Python. Coming from Ruby and JavaScript, you just use bundler or npm, but Python is so strange; even the way it runs files is different, with the module thing.


> i dont need them demystified, i need someone smarter than me to just tell me what to do lol

Dockerfile ;)


My personal approach is:

- use miniconda ONLY to create a folder structure to store packages and to specify a version of python (3.10 for example)

- use jazzband/pip-tools' "pip-compile" to create a frozen/pinned manifest for all my dependencies

- use pip install to actually install libraries (keeping things stock standard here)

- wrap all the above in a Makefile so I am spared remembering all the esoteric commands I need to pull this all together

in practice, this means once I have a project together I am:

- activating a conda environment

- occasionally using 'make update' to invoke pip-compile (adding new libraries or upgrading; see the sketch below), and

- otherwise using 'make install' to install a known working dependency list.
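
In case it's useful, here is roughly what those two targets wrap, spelled out as plain shell (file names are illustrative):

    # 'make update': re-pin everything from the direct dependencies
    pip-compile --upgrade --output-file requirements.txt requirements.in
    # 'make install': install the known-good pinned set
    pip install -r requirements.txt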


thanks for sharing. I've thought about the same approach. Conda installs are... annoying to say the least, but they do provide a better UX compared to manually managing venvs. Your approach seems mature. (why not ./venv/ per project? because you can't do that when your project directory is on another disk) (also I got burned with poetry in regards to very long dependency checking. I'm not making libraries, just an environment for my own projects)


This seems simplistic and low drag. Do you have an example you can share?

Thanks!



All other languages: use whatever packages you like. You’ll be fine.

Python: we’re going to force all packages from all projects and repos to be installed in a shared global environment, but since nobody actually wants that we will allow you to circumvent that by creating “virtual” environments you can maintain and have to deal with instead. Also remember to activate it before starting your editor or else lulz. And don’t use the same editor instance for multiple projects. Are you crazy???

Also: Python “just works”, unlike all those other silly languages.

Somebody IMO needs to get off their high horse. I can’t believe Python users are defending this nonsense for real. This must be a severe case of Stockholm-syndrome.


Yeah, how does it go? There’s at least one obvious way to do something? Python… makes me anxious. I don’t mind writing Python but setting it up is crazy.


I'm a long time Python user, and I don't defend virtual environments at all. I just don't use them. Granted, I'm a so called "scientific" programmer, and am not writing production code. I haven't run into the problems that are solved by virtual environments, nor have my colleagues. Sure, it means I'm probably living in a bubble, but it may be a bubble shared by a lot of people.

Python is the first language that I've used, where the user community is a major attraction, resulting in significant inertia. Replacing Python requires a new language and a new community. Also, the tools that helped build that community, such as Google and Stackoverflow, have (by some accounts) deteriorated.

If package management is that bad, then yeah, time to switch languages.

No high horse here. Like Sancho Panza, I have to be content with an ass. ;-)


What an awful take.

Python is 30 years old and back then system packages were more common, that's it. It's been a bumpy ride and I'd prefer if there was a standard manager, but the community has produced some decent ones.

The python community has always tried to make devs happy. Pip has an amazing number of libraries. Virtual envs were included into the standard tools. Pipenv was integrated in the python github organization. The org doesn't hate virtual envs!


Basically the point is to avoid the system python, which is not hard. One needs some sysadmin skills to understand what is going on, however; unfortunately it sounds like those are in short supply.

I don't do anything you mention, so there must be a simpler way.


It may not be hard to avoid system python, but that doesn't mean it's not also easy to mess everything up in 99 different barely-reproducible ways. I'm perfectly comfortable setting up python for projects I work on (with maybe 2-3 different methods), but that doesn't mean it's user-friendly or beginner-friendly. It doesn't mean the abstractions aren't taxing. I really agree with a lot of the criticism here.


"Mess everything up" is not a real state, it can always be fixed. However it might be faster, if you're using source control and short on time, to simply delete the folder(s), then re-clone and reinstall.

Beginner-friendly is to avoid all this complexity in the first place. I think it's a mistake to mention venvs to newbies until they are comfortable with python, paths, links, and environment vars.


Python is over 30 years old so it's hardly surprising that it's got plenty baggage in at least some areas.



It feels like it is one of the reasons experienced devs are ditching Python for production systems. Besides horrendous performance and lousy semantics. The cost of setting up, maintaining the environment and onboarding people is just not worth it.


Excuse my ignorance, but aren’t Virtual Environments something you setup once per project? Why would that be a dealbreaker? How is it any more difficult than putting everything in a docker container like all the cool kids are doing these days?


* Local state is changing such as brew updates or new dependencies are added.

* External state is changing such as project contributions

So it's not a one-off unless the project and dev environment are static. The real problem is different tooling doing different amounts of hand holding and automation. Your editor may configure some things automatically, brew may configure some things automatically, and so a set of instructions for setup or problem fixing could be voided without the end user knowing. So now you're off on an adventure of unknown duration, wading through internet forums trying to find the resolution that works for you.

Ironically, using Docker to isolate the environmental changes is an approach some people use to avoid some of this esoteric crap.


> Ironically using Docker

I'd even dare to say that Docker _is_ the answer to the Python's packaging problems, and might have never taken off without that "killer app" that is the Python packaging sh*tshow.


It's only a workaround unless the python env in the resulting docker image is reproducible for all practical purposes. Otherwise the generated images become precious artifacts you can't lose - there is a risk you'll get problems if you re-build the image now instead of reusing the working one built X years/months ago.


Fully agree. I didn't say it's a good answer ;)


Yep. Any big Python repo is gonna have a Dockerfile. If something doesn't fix itself, users take matters into their own hands.


Docker is not the answer to any packaging problems because it's not a packaging tool. I have no idea how people don't understand this... oh wait! I'm talking to Python programmers!

But... I'm not going to hold you in the dark: the reason and the major drive to have a packaging system is that you can define dependencies between packages s.t. users installing package can coordinate and install the stuff they need. Docker simply doesn't do that. You get images. They are completely independent. Whether any two images will work with each other is anyone's guess.


Your post reminds me of the saying with the faster horses.

Docker doesn't solve Python's packaging problems by being a better packaging system. It solves it by sidestepping the issue of users installing packages themselves, and shipping the disk of the one machine where it once worked.

It's a bad linker.

(not a fan of the condescending tone in your post, btw)


> It solves it by sidestepping the issue

Lol. It doesn't. How are you going to create Docker images? Run more Dockers inside Docker? And then Dockers all the way down? You still need to install the packages... just while creating an image. Makes absolutely no difference whether you do it inside or outside, you still need something that can install and coordinate packages. Docker is simply irrelevant to this problem.


> How are you going to create Docker images?

Once. On a build server. Or a dev's workstation who had the time to fiddle with all the breakage that is python envs. This absolutely does sidestep the issue of distributing Python packages to your users.

We're talking about different problems here. You are talking about library dependencies for devs (and I understand that I'm a bit off topic for this thread)


> Once. On a build server.

I didn't ask where, I asked how. How will you know what packages you need to install? Are you going to go on pypi.org, download them one by one, unzip, look for requirements, then go to pypi.org again, download more, and so on, until you have enough packages and have figured out all the requirements? If so, I have very bad news for you: your single installation might take weeks, possibly months, or might in fact be so slow it will never end, as new packages will be published faster than you can identify the versions you need.

> This absolutely does sidestep the issue of distributing Python packages to your users.

No it doesn't... OMG. You cannot be serious? Do you really search and solve dependency problems w/o using any dedicated software? Like in your head? Or with pen and paper? For any non-trivial project the number of combinations you have to try before you find the one that works will be astronomical...

> We're talking about different problems here.

Nah. You just don't understand the problem. It's similar, but not the same.


You have a very abrasive style of dialog, combined with an unwillingness to consider alternative POVs in your discussion, bordering on trolling. I like discussion, but I don't enjoy this. With that, I will disengage now.


You don't have an alternative point of view. You don't have a point of view, because, again, you don't understand something very basic about this subject.


You don’t understand something very basic about adult people interaction, that’s much worse.


Isn’t it considered best practice not to use brew to install Python for exactly this reason?

I’ve always seen it recommended to use pyenv or just download directly from Python.org instead.


Python has always struggled with distribution maintainers that just don’t get it.

Python barely survived becoming the default scripting language on Red Hat and other Linux distros, which was a major obstruction to the Python 3 transition. If the new cohort of pandas and scikit-learn users had not been such a force of nature we'd be talking today about Python the way we do about Perl.

Not installing venv is a serious crime on the part of Debian as beginners don’t need any more excuse to trash their system with site-local directories and other wrong answers for how to manage packages in Python.


People just want to get things done. If they can get away with two commands instead of six they will, which is why I think poetry has gotten more adoption.


What's the crime in Debian? The "right answer about how to manage packages" is "apt install package". Anything else (including venvs) is a hack


The word package is overloaded. There is a Debian package and there is a Python package.

If somebody wanted to package a modern Python application as Debian package they’d have to make a Debian package that contains several Python packages maybe as a venv or executable wheel, it is a solvable problem but a bit like comparing tic tac toe to 12d chess with supersymmetry in terms of the ambition of Linux distros if not the difficulty.

If you installed python packages as debs without any isolation you'd never be able to install two applications that depended on different versions of the same Python package.

The best thing about Java is that it is so xenophobic that nobody expects to have a working Java runtime installed with the OS, so you never have to deal with a broken runtime installed with the OS. JAR files are remarkably self-sufficient so "write once, run anywhere" is way closer to the truth than it is on "DLL Hell" platforms like Windows and Linux.


It takes more work to create and push a Debian package. Yes. But once it's done, it's DONE, and you never talk about it again. This whole article and thread can disappear, which is a MASSIVE time savings. The "DLL hell" you're referring to exists on platforms that don't have package management, which notably does not apply to Linux, in general.


The funny part about Java is that "Enterprise Java" used to depend a lot more on things being in the base application server, and because that was, as usual, complete hell, it now went the complete other way and microservices will often "just bundle the entire application server" in the uberjar as a result.


You can easily install python3-venv as a package. Why should it be installed by default, when many users will never use it?


Because people are lazy and the right thing must be easier than the wrong thing. The main project in Python now (going well) is removing the footguns, and

    pip install --user
is one of them. If users don't have venv they will trash their Python installation almost right away and probably join the many people who've left Python.


> Isn’t it considered best practice not to use brew to install Python for exactly this reason?

The only time I've ever had a big mess with broken Python installs was after using brew on Linux - luckily killing off brew brought the system Python back fine. I'll grudgingly put up with brew on a Mac out of necessity, but keep it away from Linux.


Not that I would truly recommend using brew on Linux, as there are better alternatives - docker images, distrobox with a rolling distro for newer python, or an LTS distro to keep an older version, pyenv, etc. - but... If you've had a problem with brew, it was most certainly caused by sourcing it in your .profile, .bash_profile or .zprofile. Never do that. It's what brew's homepage recommends, but it can break a LOT of things, and not just python. Brew should be in the $PATH only when you need it, from your current instance of interactive shell used for development, and not login-wide. The brew version of various pieces of software should not ever be able to take over what your system has, or very unpredictable things will happen.


I avoid brew whenever possible, partially because it does weird stuff just to avoid installing things as root. MacPorts is nice but doesn't always have the port you want. Something like Python with a convenient .pkg download directly from python.org makes the choice obvious.

That said, some Python packages rely on native code that you might find yourself brew-installing. That can be a nightmare.


I prefer to not let my editor do any package management at all, unless everything is put into config files, which are standard. For example in pyproject.toml for poetry. Simply putting a Makefile can make things comfortable enough. No need for editor specific magic or depending on buggy plugins, that do things right only in 90% of the cases.


> but aren’t Virtual Environments something you setup once per project?

My typical count is 2-3 per project per machine I work on (could be anywhere from one to one hundred). Then there's a different number of people who need to set up these environments on different machines too (and sometimes require my support).

So, the answer is: who knows?

> putting everything in a docker container

I think, you meant image, not container, but that's an easy mistake to make. And the answer is: both options are terrible. It's a choose your poison kind of thing.

> cool kids

My impression from working with the cool kids is that our industry selects for particular taste in t-shirts rather than knowledge or experience. I'm afraid that the correlation might swing into the negative, if we take either knowledge or experience vs coolness.

---

Most importantly: venv is not the problem. It's a bad fix for a problem. Bad as in it doesn't fix the problem actually, it pretends to. I mean, maybe it covers some 90% of the problem -- who knows, I didn't count. So, it kinda works for a lot of people. But, honestly, I'd prefer that the problem was fixed s.t. it doesn't require venv. It's kind of like discussing various dental prosthetics options: it's better to just have healthy teeth.


> The cost of setting up, maintaining the environment and onboarding people is just not worth it.

I have yet to come across a situation where I need a virtual environment at all. A lot of projects use it, but then lazy me just runs git clone && python3 clone/main.py and it works just fine, sometimes after an apt install python3-somedependency or two.

It always seemed weird to me to depend on such a specific version ("package foo can't be older than 7 months or newer than 4 months"); how does one even manage to use features so obscure that they were removed, leaving you needing an older-than version?

And then there's the people in this thread (seemingly a majority when judging by top level comment loudness) that have trouble with virtual environments and so add another layer on top to manage that virtual environment thing. The heck. Please explain


Do you deal with the scientific libs? I remember that whole MatPlotLib/Scipy/Pandas/Jupyter/whatever stack having weird requirements on dependencies, with Py2 vs 3 confusion added to the mix.


I think the problems have been resolved. For my last few installations of Python, I've just used pip install for those packages, without any issues. Linux and Windows, can't comment about Mac. And the important libraries are all in Py3 now.

I haven't tried out virtual environments yet.


Yeah I'm willing to bet it's gotten easier lately, perhaps because things have settled down. Then again Mac is always its own special case with Py libs.


I've used matplotlib and pandas fairly recently because applicants used it in their code, don't remember any specific problems with that. Well, with the dependency installation that is. The applicant loading all data into RAM so that Pandas can operate on it was an issue. (The whole point was that the data was too big to reasonably load into RAM, and would constantly grow as more sensor readings come in, and the questions are such that you can easily stream it and keep a few KB of state in a dictionary to find the answers needed... alas.)

I do remember python2-only being a problem back in the day, but this was solved... hmm, maybe in 2017 somewhen? At least for the packages I used then that had py3 issues before, like sklearn, numpy, and scapy come to mind. I think it more or less coincided with Debian deciding the next release was not going to ship Python 2. Somehow that made everyone 2to3, fix a few remaining bugs, release, done. I'm too young (I'm 30) to really have done much with Python 2 so I didn't have this legacy in my own code (besides a few early files), I've ~always just tried to find and install Python 3 versions of everything.


Are experienced devs really ditching Python for production systems? I wasn't under this impression.


Experienced Python devs run their projects in docker, which solves the 3 issues you listed at the end.


It very much depends on what kind of project it is. Good luck shipping a desktop app that runs in Docker, for example. Not everything is web and ML.


Experienced Python dev checking in here. I use neither containers nor virtual environments. Honestly not sure what all the fuss is about.

Maybe everyone has come to think we need layers on layers on layers because management tools (like venv) are blogged and talked about, whereas it's a bit dull to write/talk about nothing (i.e. not using/needing such tools)? I genuinely wonder


These days I'm just throwing each project into a fresh LXC on a server.

All these different languages have their own approach and each then also user/global/multiple versions...it's just not worth figuring out


Question: what makes you choose LXC over Docker?


IMO, while their use cases do overlap, LXC is more geared towards a user installing things while Docker is more geared towards a developer creating a ready to use package.

LXC creates environments, while Docker creates apps, is another way to say it.


Much of a sameness really but I prefer the more persistent disk style of lxc plus ssh plus vscode ssh remote extension.

Depends on task though I've got dockers and VMs in use too


Same, but with Docker. I don't like conda, was fine with virtualenv, but using docker there's only one python and you can just pip install and not worry about multiple environments.


Virtual environments are easy to create and manage. Create one with the built-in venv module:

    python3.10 -m venv ./venv  # or your favorite version
    . ./venv/bin/activate
    pip install pip-tools
Manage dependencies using pip-compile from pip-tools. Store direct dependencies in "requirements.in", and "freeze" all dependencies in "requirements.txt" for deployment:

    . ./venv/bin/activate
    pip-compile -U -o ./requirements.txt ./requirements.in
    pip install -r ./requirements.txt


> One point I would like to make is how virtual environments are designed to be disposable and not relocatable.

Is the author saying that relocating them will actually break things, or that it's just as easy to recreate them in a different location? Because I've moved my venv directories and everything still seemed to work OK. Did I just get lucky?


It's a gamble to move venvs.

The real way to move venvs is to freeze the venv (i.e. make a requirements.txt) and then pip install -r requirements.txt to recreate the venv.

This process is really the only thing about venvs that ever causes me trouble.
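
A minimal sketch of that freeze-and-recreate workflow (the paths are just examples):

    ./old-venv/bin/pip freeze > requirements.txt
    python3 -m venv /new/location/venv
    /new/location/venv/bin/pip install -r requirements.txt
Calling the venv's own pip directly avoids having to activate either environment.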


Depends if any of your packages use absolute paths (generated at install time for example).


There’s also relocating across machines. For example, maybe your build environment has access to internal registries but your release environment does not. I naively thought you could build your venv and just copy to the new machine (both environments were Ubuntu) but ran into errors (due to links breaking). We also used pex for a bit, which is kind of like building a binary of a venv, and that eventually stopped working too when the C ABI was no longer the same between environments. There didn’t seem to be an easy way to pick the ABI version to target when creating the pex file, so I gave up and just downloaded the wheels for internal packages in the build.


By default when you activate a virtualenv, it uses hardcoded absolute paths (determined at the time the environment was created), so moving the directory will break it.
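
You can see the baked-in paths for yourself (the output will show whatever absolute path the venv had when it was created):

    head -n 1 ./venv/bin/pip                   # shebang points at the venv's python by absolute path
    grep 'VIRTUAL_ENV=' ./venv/bin/activate    # same absolute path hardcoded in the activate script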


> relocating them will actually break things

Yes, absolute paths are hardcoded in several places.

I actually have a use case for copying/relocating them (for https://apibakery.com), and simple search/replace of paths across the entire venv works, but I wouldn't recommend that as a best practice approach :-)


One example of a problem I've had: symlinked python bins no longer existing in the same place, requiring relinking.


I’ve tried to move things and broken everything. (Conda environments). I tried replacing the paths in the files and it didn’t work. We run a bunch of different tools with various python requirements and would like to be able to duplicate them for the next tool.

We ended up making a new environment for each. Honestly it's a bit of a mess.


Relocating them will actually break things in many cases, especially when native code is involved.


I have my whole conda env folder symlinked to my second drive. Impossible to store 120GB of environments otherwise.


I use poetry or docker or nixpkgs

I've given up.

EDIT: also just finding myself reaching for go in most cases


How much of this is caused by a join over "odd" decisions of what is installed by Python3 developers, "odd" decisions of what a "package" is by package makers and what I think I want to call "fanaticism" by Debian apt around things?

FreeBSD ports are significantly closer to "what the repo has, localized" where it feels like linux apt/yum/flat is "what we think is the most convenient thing to bodge up from the base repo, but with our special sauce because <reasons>"


That's insightful.

It seems that a virtual environment created by Poetry looks very similar, except that it doesn't contain an `include` directory. It contains:

* `bin` directory

* `lib/<python-version>/site-packages/` directory

* `pyvenv.cfg`


I didn’t realize venv was part of the standard library. If that’s the case, how is it that conda even exists? Anybody got a good history of this?


conda can install things other than Python packages. C++ compilers, for example, or native libraries that Python packages depend on.
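
For example, something like this pulls a native toolchain into an environment (a sketch only; the "compilers" and "cmake" package names are the conda-forge ones, so treat them as assumptions if you use a different channel):

    # package names assume the conda-forge channel
    conda create -n native-build -c conda-forge python=3.11 cmake compilers
    conda activate native-build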


venv is part of the standard library from Python 3. It's not in Python 2.


I'm beginning to feel like every single comment in every thread related to python package management is just this:

"Package management in python is so easy, just use [insert tool or workflow that's different to literally every other comment in the thread]."


I don't bother with venvs anymore and just use podman instead.


Been really enjoying trying out pdm in PEP 582 mode. I've just found it behaves well when used across multiple devs who aren't necessarily that used to working with python.


The "global" vs. "directory" dichotomy seems... off. Haven't PYTHONHOME and PYTHONPATH been supported since approximately forever?


I haven't used these since docker


That's just giving up.


Yes it is giving up, but not only. It is giving up and being able to get back to the actual work you want to be doing.


I worked at a place where people tried using Docker to manage Python and wound up with a bunch of twisty little images that were all misconfigured in a different way. (E.g. default charsets that I’m not sure anybody uses.)


Learning to make sound container images is tricky, but once you got it figured out it is, IMO, a very nice way to distribute software.


Sometimes that's the best option.


That’s being pragmatic.


But it's giving up correctly*! :)


With docker, do you use debugging in pycharm/vscode, or just for compiling/shipping?


Both. Setting up the editor took a little doing, but it works well. https://code.visualstudio.com/docs/containers/quickstart-pyt...


Just setup a django project with pipenv, works just fine.


Pipenv has never once worked just fine personally. The dependency resolution is a joke and the slowest of any project in this space, they have tons of bugs and the project is languishing

I prefer to use a combination of pip-tools and pyenv for my projects


There was a time when pipenv seemed to be the most precise in dependency constraint resolution out of the tools available. Poetry did not see some constraints iirc, and pip did not check at all. However, Poetry has developed much faster than pipenv; pipenv breaks too often and is left far behind by Poetry now.


This writeup needs work.

> So while you could install everything into the same directory as your own code (which you did, and thus didn't use src directory layouts for simplicity), there wasn't a way to install different wheels for each Python interpreter you had on your machine so you could have multiple environments per project (I'm glossing over the fact that back in my the day you also didn't have wheels or editable installs).

This is a single run-on sentence. Someone reading this probably doesn't know what "wheels" means. If you are going to discount it anyway, why bring it up?

> Enter virtual environments. Suddenly you had a way to install projects as a group that was tied to a specific Python interpreter

I thought we were talking about dependencies? So is it just the interpreter or both or is there a typo?

> conda environments

I have no idea what those are. Do I care? Since the author is making a subtle distinction, reading about them might get me confused, so I've encountered another thing to skip over.

> As a running example, I'm going to assume you ran the command py -m venv --without-pip .venv in some directory on a Unix-based OS (you can substitute py with whatever Python interpreter you want

Wat? I don't know what venvs are. Can you maybe expand without throwing multi-arg commands at me? Maybe add this as a reference note, rather than inlining it into the information. Another thing to skip over.

> For simplicity I'm going to focus on the Unix case and not cover Windows in depth.

Don't cover Windows at all. Make a promise to maintain a separate doc in the future and get this one right first.

> (i.e. within .venv):

This is where you start. A virtual environment is a directory, with a purpose, which is baked into the ecosystem. Lay out the purpose. Map the structure to those purposes. Dive into exceptional cases. Talk about how to create it and use it in a project. Talk about integrations and how these help speed up development.

I also skipped the plug for the microvenv project at the end, with a reference to VSCode.


I expect most everyday python users know what these things are. I also expect this was targeted at python users who use these things but haven't thought deeply about them.

Charitably, I will assume you are a non python user, and that's why this is a miss for you.



