I've been using Python since like 2006, so maybe I just have that generational knowledge and battlefront experience... but whenever I come into threads like this I really feel like an imposter or a fish out of water. Like, am I using the same Python that everyone else is using? I echo your stance - the less overhead and additional tooling the better. A simple requirements.txt file and pip is all I need.
Isn't pip + requirements.txt insufficient for repeatable deployments? You need to pin all dependencies, not just your immediate project dependencies, unless you want some random downstream update to break your build. I guess you can do that by hand... but don't you kind of need some kind of lock file to stay safe/sane?
Now you can install prod requirements or dev requirements or whatever other combination of requirements you have, and you're guaranteed to get the exact same subset of packages, no matter what your transitive dependencies are doing.
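Roughly, the shape of it is something like this (file names are illustrative):

    python -m venv .venv && . .venv/bin/activate
    pip install -r requirements.txt -r requirements-dev.txt   # top-level deps only
    pip freeze > constraints.txt                               # pin the full resolved set
    # later, in any environment:
    pip install -c constraints.txt -r requirements.txt         # or add -r requirements-dev.txt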
You can use pip-compile from pip-tools if you want the file to include exact hashes.
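For example, assuming your top-level dependencies live in a requirements.in file:

    pip-compile --generate-hashes --output-file=requirements.txt requirements.in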
This is true, but now you're explicitly depending on all of your transitive dependencies, which makes updating the project a lot harder. For example, if a dependency stops pulling in a transitive dependency past a certain version, you'll need to either recreate the constraints file by reinstalling everything, or manually remove the dependencies you don't need any more.
Also pip freeze does not emit a constraints file, it emits (mostly) a requirements file. This distinction is rarely important, but when it is, it can cause a lot of problems with this workflow. For example, a constraints file cannot include any information about which extras are installed, which pip freeze does by default. It also can't contain local or file dependencies, so if you have multiple projects that you're developing together it simply won't work. You also can't have installed the current project in editable mode if you want the simple "pip freeze" workflow to work correctly (although in practice that's not so difficult to work around).
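For instance, freeze output from an environment with local and editable installs can contain lines like these (names and paths made up), none of which are valid in a constraints file:

    requests==2.31.0
    mypkg @ file:///home/me/src/mypkg
    -e git+https://github.com/example/otherpkg@abc1234#egg=otherpkg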
Pip-tools does work a bit better, although the last time I used it, it considered the dependency chains for production and for development in isolation, which meant it would install different versions of some packages in production than in development (which was one of the big problems I was trying to solve).
From my experience trying basically every single option in the packaging ecosystem, there aren't really any solutions here. Even Poetry, which is pretty much best-in-class for actually managing dependencies, struggles with workspace-like installations and more complicated build scripts. Which is why I think pretty much every project seems to have its own, subtly unique build/dependency system.
Compare and contrast this with, say, NPM or Cargo, which in 95% of cases just do exactly what you need them to do, correctly, safely, and without having to think about it at all.
> This is true, but now you're explicitly depending on all of your transitive dependencies
They're constraints, not dependencies; they don't need to be installed, and you can just update your requirements as you need and regenerate them.
> Also pip freeze does not emit a constraints file, it emits (mostly) a requirements file. This distinction is rarely important, but when it is, it can cause a lot of problems with this workflow. For example, a constraints file cannot include any information about which extras are installed, which pip freeze does by default
pip freeze does not use extras notation; you just get the extra packages listed as individual dependencies. Yes, there is an important distinction between constraints and requirements, but pip freeze uses a subset of the notation that both formats share.
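For example, if you'd installed requests[socks], the freeze output just contains the expanded, individually pinned packages, something like:

    PySocks==1.7.1
    requests==2.31.0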
> You also can't have installed the current project in editable mode if you want the simple "pip freeze" workflow to work correctly
That's why the workflow I gave to generate the constraints didn't use the -e flag: you generate the constraints separately and can then install however you want, editable or not.
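For example (the extras name is just a placeholder):

    pip install -c constraints.txt -e ".[dev]"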
> From my experience trying basically every single option in the packaging ecosystem, there aren't really any solutions here. Even Poetry, which is pretty much best-in-class for actually managing dependencies, struggles with workspace-like installations and more complicated build scripts. Which is why I think pretty much every project seems to have its own, subtly unique build/dependency system.
People have subtly different use cases that make a big impact on what option is best for them. But I've never been able to fit Poetry into any of my use cases completely, whereas a small shell script to generate constraints automatically out of my requirements has worked exceedingly well for pretty much every use case I've encountered.
'pip freeze' will generate the requirements.txt for you, including all those transitive dependencies.
It's still not great, though, since that only pins version numbers, not hashes.
You probably don't want to manually generate requirements.txt. Instead, list your project's immediate dependencies in the setup.cfg/setup.py file, install that in a venv, and then 'pip freeze' to get a requirements.txt file. To recreate this in a new system, create a venv there, and then 'pip install -c requirements.txt YOUR_PACKAGE'.
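As a sketch, with the package name and file names as placeholders:

    # on the build machine
    python -m venv .venv && . .venv/bin/activate
    pip install .                   # resolves the deps declared in setup.cfg/setup.py
    pip freeze > requirements.txt   # pins everything that actually got installed
                                    # (you may want to strip the project's own line from this file)

    # on the new system
    python -m venv .venv && . .venv/bin/activate
    pip install -c requirements.txt YOUR_PACKAGE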
It was pretty bad before but now it seems like there are a bunch of competing solutions each with their own quirks and problems. It feels like the JavaScript ecosystem.
Ironically, the JavaScript ecosystem is far better than the Python ecosystem when it comes to packaging and dependencies. NPM just does the right thing by default: you define dependencies in one place, and they are automatically fixed unless you choose to update them. Combine that with stuff like workspaces and scripts, and you basically have everything you need for the vast majority of use cases.
Yes, there are also other options like Yarn, which have typically had newer features and different approaches, but pretty much everything that works has been folded back into NPM itself. Unless you really want to live at the bleeding edge for some reason, NPM is perfectly sufficient for all your needs.
In contrast, the closest thing to that in the Python ecosystem is Poetry, which does a lot of things right, but is not supported by Python maintainers, and is still missing a handful of things here and there.
I'm not saying the JS ecosystem as a whole is perfect, but for packaging specifically, it's a lot better than Python.
I mean, a project needs regular care and maintenance, however you organise it. If you're never scheduling time to maintain your dependencies, you're going to be in trouble either way. But at least if you lock your dependencies, you know what will actually get installed, and you can find the buggy or insecure versions.
We found a bug on a Python project I worked on recently that only seemed to happen on certain machines. We couldn't reproduce it in a dev environment, and one machine that was affected suddenly stopped being affected after a while. It turns out the issue was a buggy dependency: one particular build of the project happened to have picked up the buggy version, but later builds used the fixed version and so didn't have a problem. So we'd only see the bug depending on which build the machine had last used, and if someone put a different build on there, it would reset that completely. On our development machines, we used slightly different builds that just happened not to have been affected.
Pinning dependencies wouldn't necessarily have prevented the bug in the first place - sometimes you just have buggy dependencies - but the debugging process would have gone much more quickly and smoothly with a consistent build environment. We could also have been much more confident that the bug wouldn't accidentally come back.
That's definitely a solution, but it comes with its own problems, in particular that you add a significant dependency on what is essentially a middleman organisation trying to manage all possible dependencies. This doesn't scale very well, particularly because there's a kind of M×N problem where M packages can each have N versions which can be depended on. In practice, most distros tend to only support one version of each package, which makes the job easier for the distro maintainer, but makes things harder for everyone else (library authors get bug reports for problems they've already fixed, end users have less ability to choose the versions they need, etc).
In particular, it also makes upgrading a much more complex task. For example, React releases new major versions on a semi-regular basis, each one containing some breaking changes, but not many. Ideally there wouldn't be any, but breaking changes are inevitable with any tool as situations change and the problem space becomes better understood. But because the NPM ecosystem generally uses locked dependency lists, end users can upgrade at their leisure, either with small changes every so often, or only upgrading when there's a good reason to do so. Both sides can be fairly flexible in how they do things without worrying about breaking something accidentally.
Under a Linux distribution model, however, those incremental breaking changes become essentially impossible. That means either projects accumulate cruft that can never be removed, which makes maintainers' and users' lives more complex, or projects have to do occasional "break everything" releases à la Python 2/3 in order to regain order, which is also more work for everyone. There is a lot less flexibility on offer here.
I don't think these sorts of problems disqualify the Linux distribution model entirely - it does do a lot of things well, particularly when it comes to security and long-term care. But there's a set of tradeoffs at play here, and personally I'd rather accept more responsibility for the dependencies that I use, in exchange for having more flexibility in how I use them. And given the popularity of language-specific package repositories that work this way, I get the feeling that this is a pretty common sentiment.
What happens when your distribution only has old versions, or worse, no versions of the libraries you need? Do you hop distributions? Do you layer another distribution like Nix or Anaconda over your base distribution? Do you give up and bundle another entire distribution in a container image?
Updating packages should be strictly left to the developer's discretion. That schedule is up to the developer using the packages, not upstream.
Not to mention that dependencies updating themselves whenever they like to "fix vulnerabilities" is a sure-fire way to break your program and introduce behavioral regressions and new vulnerabilities...
The "Javascript ecosystem" on my personal experience seems to prefeer installing everything in the global environment "for ease of use convenience" and then they wonder how did a random deprecated and vulnerable dependency get inside their sometimes flattened, sometimes nested, non-deterministic dependency chain (I wish the deterministic nested pnpm was the standard...) and (pretend) they did not notice.
That being said, the JavaScript ecosystem has standardized tooling to handle that (npx), which Python doesn't (I wish pipx were part of standard pip); they just pick the convenient footgun approach.
I don't think so. Python is batteries-included, and most packages in the Python ecosystem are not as scattered as npm packages. The number of packages in a typical Python project is much smaller than in a Node.js project. I think that's the reason why people are still happy with simple tools like pip and requirements.txt.
There's a PEP to get a part of it right [1] - at least the dependency-installation and virtualenv side of things - but at the moment the packaging nonsense is still as bad as it has always been.
>> Are pip maintainers on board with this?
> Personally, no. I like the idea in principle, but in practice, as you say, it seems like a pretty major change in behaviour and something I’d expect to be thrashed out in far more detail before assuming it’ll “just happen”.
As if the several half-arsed official solutions that already exist around packaging (the multiple ways to build and publish packages) had deep thinking and design behind them...
Twice bricking my laptop's ability to do Python development because of venv + symlink BS was the catalyst I needed to go all-in on remote dev environments.
I don't drive Python daily, but my other projects thank Python for that.
Lol. You put "simple" and "requirements.txt" unironically next to each other...
I mean, I think you genuinely believe that what you suggest is simple... so I won't pretend not to understand how you might think that. Let me explain:
There's simplicity of performing a process and simplicity of understanding it. It's simple to make more humans; it's very hard to understand how humans work. When you use pip with requirements.txt, you're getting the simple-to-perform part, but you have no idea what stands behind it.
Unfortunately for you, what stands behind it is ugly and not at all simple. Well, you may say that sometimes that's necessary... but in this case it's not. It's the product of a series of failures by the people working on this system: mistakes, misunderstandings, and bad designs that set in motion processes which, in retrospect, became impossible to revert.
There aren't good ways to use Python, but even with what we have today, pip + requirements.txt is not anywhere near the best you can do, if you want simplicity. Do you want to know what's actually simple? Here:
Store links to the wheels of your dependencies in a file. You can even call it requirements.txt if you so want. Use curl or equivalent to download those wheels and extract them into what Python calls "platlib" (finding it is left as an exercise for the reader), removing everything in the scripts and data directories. If you feel adventurous, you can put the scripts into the same directory where the Python binary is installed, but I wouldn't do that if I were you.
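As a minimal sketch of what I mean (assuming requirements.txt holds one wheel URL per line):

    platlib=$(python -c "import sysconfig; print(sysconfig.get_paths()['platlib'])")
    while read -r url; do
      curl -sSLO "$url"
      unzip -q -o "$(basename "$url")" -d "$platlib"
    done < requirements.txt
    rm -rf "$platlib"/*.data    # drop the scripts/data payloads rather than installing them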
Years of being in infra roles taught me that this is the most reliable way to keep nightly builds running quietly, avoiding the various "infra failures" caused by how poorly Python's infra tools behave.
What are specific problems you have with pip + requirements.txt, and why do you believe storing links to wheels is more reliable? Your comment makes your conclusion clear, but I don't follow your argument.
Pip is a huge and convoluted program with tons of bugs. It does a lot more than just download Python packages and unpack them into their destination. Obviously, if you want something simple, then an HTTP client, which constitutes only a tiny fraction of pip, would be a simpler solution, wouldn't it?
In practice, pip may not honor your requirements.txt the way you think it would, even if you require exact versions of packages (which is something you shouldn't do for programs / libraries anyway). This is because pip may install one thing first, along with its dependencies, and then move on to the next item, which may or may not be compatible with what was already installed.
The reason you don't run into situations like this often enough to be upset is that a lot of Python projects don't survive for very long. They become broken beyond repair after a few years of no maintenance, where by maintenance I mean constantly chasing the most recent set of dependencies. Once you try to install an older project using pip and requirements.txt, it's going to explode...