>Any flow that does not state checksums/hashsums is not ready for production
It's neither designed nor intended for such. There are tons of Python users out there who have no concept of what you would call "production"; they wrote something that requires NumPy to be installed and they want to communicate this as cleanly and simply (and machine-readably) as possible, so that they can give a single Python file to associates and have them be able to use it in an appropriate environment. It's explicitly designed for users who are not planning to package the code properly in a wheel and put it up on PyPI (or a private index) or anything like that.
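For anyone who hasn't seen it, the PEP 723 block is just a specially formatted comment at the top of the script; a minimal sketch (the NumPy requirement is purely illustrative):

```python
# /// script
# requires-python = ">=3.9"
# dependencies = [
#     "numpy",
# ]
# ///

import numpy as np

print(np.linspace(0.0, 1.0, 5))
```

A runner that understands the block (`uv run script.py`, `pipx run script.py`) reads it, prepares a suitable environment, and executes the file; the script itself stays a single file.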
>and all but beautiful
De gustibus non est disputandum. The point is to have something simple, human-writable and machine-readable, for those to whom it applies. If you need to make a wheel, make one. If you need a proper lock file, use one. Standardization for lock files is finally on the horizon in the ecosystem (https://peps.python.org/pep-0751/).
>Actually the order of import statement is one of the things, that Python does better than JS.
Import statements are effectively unrelated to package management. Each installed package ("distribution package") may validly define zero or more top-level names (of "import packages") which don't necessarily bear any relationship to each other, and the `import` syntax can validly import one or more sub-packages and/or attributes of a package or module (a false distinction, anyway; packages are modules), and rename them.
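To make the name mismatch concrete (beautifulsoup4/bs4 is the classic example):

```python
# The distribution package is "beautifulsoup4" (what you pip install),
# but the import package it provides is named "bs4" (what you import):
from bs4 import BeautifulSoup

# One statement can import a sub-module of a package and rename it:
import xml.etree.ElementTree as ET

# ...or pull out a single attribute of a module and rename that:
from os.path import join as path_join
```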
>An IDE or other tool only has to check one module or package for its contents
The `import` syntax serves these tools by telling them about names defined in installed code, yes. The PEP 723 syntax is completely unrelated: it tells different tools (package managers, environment managers and package installers) about names used for installing code.
>Why not have smaller venvs for separate projects? It is really not that hard to do
It isn't, but it introduces book-keeping (Which venv am I supposed to use for this project? Where is it? Did I put the right things in it already? Should I perhaps remove some stuff from it that I'm no longer using? What will other people need in a venv after I share my code with them?) that some people would prefer to delegate to other tooling.
Historically, creating venvs has been really slow. People have noticed that `uv` solves this problem, and come up with a variety of explanations, most of which are missing the mark. The biggest problem, at least on Linux, is the default expectation of bootstrapping Pip into the new venv; of course uv doesn't do this by default, because it's already there to install packages for you. (This workflow is equally possible with modern versions of Pip, but you have to know some tricks; I describe some of this in https://zahlman.github.io/posts/2025/01/07/python-packaging-... . And it doesn't solve other problems with Pip, of course.) Anyway, the point is that people will make single "sandbox" venvs because it's faster and easier to think about - until the first actual conflict occurs, or the first attempt to package a project and accurately convey its dependencies.
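To give a flavour of those tricks, a sketch assuming a modern Pip on Linux (the `--python` option needs Pip 22.3 or newer):

```sh
# Skip bootstrapping Pip into the new venv; this is the slow part:
python -m venv --without-pip .venv

# Point an existing Pip installation at the venv's interpreter instead:
pip --python .venv/bin/python install numpy
```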
> Avoiding deps for small scripts is a good thing! If possible.
I'd like to agree, but that just isn't going to accommodate the entire existing communities of people writing random 100-line analysis scripts with Pandas.
>One may complain in many ways about how NPM works, but it has had automatic lock file for aaages.
Cool, but the issues with Python's packaging system are really not comparable to those of other modern languages. NPM isn't really retrofitted to JavaScript; it's retrofitted to the Node.js environment, which existed for only months before NPM was introduced. Pip has to support all Python users, and Python is about 18 years older than Pip (19 years older than NPM). NPM was able to do this because Node was a new project, specifically designed to enable JavaScript development in a new environment (i.e., places that aren't the user's browser sandbox). By contrast, every time any incremental improvement has been introduced for Python packaging, there have been massive backwards-compatibility concerns. PyPI didn't stop accepting "egg" uploads until August 1, 2023 (https://blog.pypi.org/posts/2023-06-26-deprecate-egg-uploads...), for example.
But more importantly, npm doesn't have to worry about extensions to JavaScript code written in arbitrary other languages (for Python, C is common, but by no means exclusive; NumPy is heavily dependent on Fortran, for example) which are expected to be compiled on the user's machine, through a process automatically orchestrated by the installer. When that doesn't work, users complain to anyone they can get to listen, with no attempt at debugging, nor at understanding whose fault the failure was this time.
There are many things wrong with the process, and I'm happy to criticize them (and explain them at length). But "everyone else can get this right" is usually a very short-sighted line of argument, even if it's true.
> It's neither designed nor intended for such. There are tons of Python users out there who have no concept of what you would call "production"; they wrote something that requires NumPy to be installed and they want to communicate this as cleanly and simply (and machine-readably) as possible, so that they can give a single Python file to associates and have them be able to use it in an appropriate environment. It's explicitly designed for users who are not planning to package the code properly in a wheel and put it up on PyPI (or a private index) or anything like that.
Hence my warning about its use. And we, as a community, need to learn and be educated about dependency management, so that we stop running into the same issues, over and over again, that stem from non-reproducible software.
> Import statements are effectively unrelated to package management. Each installed package ("distribution package") may validly define zero or more top-level names (of "import packages") which don't necessarily bear any relationship to each other, and the `import` syntax can validly import one or more sub-packages and/or attributes of a package or module (a false distinction, anyway; packages are modules), and rename them.
I did not claim them to be related to package management, and I agree. I was making an assertion, trying to guess the meaning of what the other poster wrote about some "import bla from blub" statement.
> The `import` syntax serves these tools by telling them about names defined in installed code, yes. The PEP 723 syntax is completely unrelated: it tells different tools (package managers, environment managers and package installers) about names used for installing code.
If you had read my comment a bit more closely, you would have seen that this is the assertion I made one phrase later.
> It isn't, but it introduces book-keeping (Which venv am I supposed to use for this project? Where is it? Did I put the right things in it already? Should I perhaps remove some stuff from it that I'm no longer using? What will other people need in a venv after I share my code with them?) that some people would prefer to delegate to other tooling.
I understand that. The issue is that people keep complaining about things that can be solved in rather simple ways. For example:
> Which venv am I supposed to use for this project?
Well, the one in the directory of the project, of course.
> Where is it?
In the project directory of course.
> Did I put the right things in it already?
If it exists, it should have the dependencies installed. If you change the dependencies, then update the venv right away. You are always in a valid state this way. Simple.
> Should I perhaps remove some stuff from it that I'm no longer using?
That is done in the "update the venv" step mentioned above. Whether you delete the venv and re-create it, or use a dependency-managing tool that removes unused dependencies, I don't care; you will know it when you use such a tool. If you don't use such a tool, just recreate the venv. Nothing complicated so far.
> What will other people need in a venv after I share my code with them?
One does not share a venv itself; one shares the reproducible way to recreate it on another machine. Thus others will have just what you have, once they create the same venv. Reproducibility is key if you want your code to run elsewhere reliably.
All of those have rather simple answers; see the sketch below. I grant that some of these answers one learns over time, by dealing with these questions many times. However, none of it needs to be difficult.
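For the record, a minimal tool-agnostic sketch of that workflow with stock tooling (dedicated tools automate all of this):

```sh
# Record the venv's exact contents on the original machine:
.venv/bin/python -m pip freeze > requirements.txt

# Recreate it elsewhere (or after deleting it locally):
python -m venv .venv
.venv/bin/python -m pip install -r requirements.txt
```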
> I'd like to agree, but that just isn't going to accommodate the entire existing communities of people writing random 100-line analysis scripts with Pandas.
True, but those people apparently need Pandas, so installing dependencies cannot be avoided. Then it depends on whether their stuff is one-off work that no one will ever need to run again, or part of a pipeline that needs to be reliable. The use case changes the requirements with regard to reproducibility.
---
About the NPM/Pip comparison: sure, there may be differences, but none of them justifies not having hashsums of dependencies where they can be had. And if there is a C thing? Well, you will still download it in some tarball or archive when you install it as a dependency. It is easy to get a checksum of that. Store the checksum.
I was merely pointing out a basic facility of NPM, one that has been there for as long as I can remember using NPM, and that still does not exist in Pip except by using some additional packages to facilitate it (I think hashtools or something like that was required). I am not holding up NPM as the shining star that we should all follow; it has its own ugly corners. I was pointing out that specific aspect of dependency management. One can calculate the hashes of any artifact downloaded from anywhere. There are no excuses for not having the hashes of artifacts.
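To make concrete what I mean: with extra tooling (pip-tools' `pip-compile --generate-hashes`, for example) one can produce a requirements file that pins digests, and Pip will verify them when installing with `pip install --require-hashes -r requirements.txt`. A sketch (the digest below is a placeholder, not a real hash):

```
# requirements.txt -- every requirement pinned with its digest:
requests==2.32.3 \
    --hash=sha256:0000000000000000000000000000000000000000000000000000000000000000
```

The point is that nothing generates and maintains such a file automatically out of the box, the way NPM's lock file does.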
That Python is 19 years older than NPM doesn't have to be a negative. Those are 19 years more time to have worked on the issues as well. In those 19 years no one had issues with non-reproducible builds? I find that hard to believe. If anything, the many people complaining about not being able to install some dependency in some scenario tell us that reproducible builds are key to avoiding these issues.
>I did not claim them to be related to package management, and I agree.
Sure, but TFA is about installation, and I wanted to make sure we're all on the same page.
>I understand that. The issue is that people keep complaining about things that can be solved in rather simple ways.
Can be. But there are many comparably simple ways, none of which is obvious. For example, using the most basic level of tooling, I put my venvs within a `.local` directory which contains other things I don't want to put in my repo nor mention in `.gitignore`. Other workflow managers put them in an entirely separate directory and maintain their own mapping.
>Whether you delete the venv and re-create it, or use a dependency-managing tool that removes unused dependencies, I don't care; you will know it when you use such a tool.
Well, yes. That's the entire point. When people are accustomed to using a single venv, it's because they haven't previously seen the point of separating things out. When they realize the error of their ways, they may "prefer to delegate to other tooling", as I said, because it represents a pretty radical change to their workflow.
> That Python is 19 years older than NPM doesn't have to be a negative. Those are 19 years more time to have worked on the issues as well.
In those 19 years people worked out ways to use Python and share code that bear no resemblance to anything that people mean today when they use the term "ecosystem". And they will be very upset if they're forced to adapt. Reading the Packaging section of the Python Discourse forum (https://discuss.python.org/c/packaging/14) is enlightening in this regard.
> In those 19 years no one had issues with non-reproducible builds?
Of course they have. That's one of the reasons why uv is the N+1th competitor in its niche; why Conda exists; why meson-python (https://mesonbuild.com/meson-python/index.html) exists; why https://pypackaging-native.github.io/ exists; etc. Pip isn't in a position to solve these kinds of problems because of a) the core Python team's attitude towards packaging; b) Pip's intended and declared scope; and c) the sheer range of needs of the entire Python community. (Pip doesn't actually even do builds; it delegates to whichever build backend is declared in the project metadata, defaulting to Setuptools.)
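Concretely, the delegation happens through the `[build-system]` table of `pyproject.toml`; a project using meson-python, mentioned above, declares it like this:

```toml
# pyproject.toml: tells installers like Pip which backend to invoke
# when building from source (PEP 517/518). If the table is absent,
# Pip falls back to Setuptools.
[build-system]
requires = ["meson-python"]
build-backend = "mesonpy"
```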
But it sounds more like you're talking about lockfiles with hashes. In which case, please just see https://peps.python.org/pep-0751/ and the corresponding discussion ("Post-History" links there).
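For a taste of what that proposal looks like, a heavily abridged sketch of a `pylock.toml` (field names per the PEP; the package, URL and digest here are placeholders, and the PEP itself is the authoritative schema):

```toml
lock-version = "1.0"
created-by = "example-locker"

[[packages]]
name = "example"
version = "1.0"

[[packages.wheels]]
name = "example-1.0-py3-none-any.whl"
url = "https://files.pythonhosted.org/packages/..."
hashes = {sha256 = "0000000000000000000000000000000000000000000000000000000000000000"}
```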