You would have to lock the file, or guarantee consistency in some other way. Right now, I don't believe Linux does anything about consistency of reads / writes to that file... which is bad, but we pretend not to notice.
So... the system is kind of broken to begin with, and it's kind of pointless to try to assess its performance.
Also, it would obviously make a lot of difference whether you had a hundred users with only a handful of them active, or a hundred active users. I meant active users. Running programs all the time.
NB. You might have heard about this language called Python. Upon starting the interpreter it reads /etc/passwd (because it needs to populate some "static" data in the os module). I bet a bunch of similar tools do the same thing. If you have a bunch of users all running Python scripts while there are some changes to the user directory... things are going to get interesting.
> You would have to lock the file, or guarantee consistency in some other way.
I think the standard approach to atomicity is to copy, change the copy, then move that copy overwriting the original (edit: file moves are sorta atomic). Not perfect but generally works.
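A minimal sketch of that copy, change, move pattern in Python, in case it helps. The function name `atomic_rewrite` and the `transform` callback are made up for illustration; this is not what any particular tool does. It assumes a POSIX filesystem and that the temporary file sits in the same directory as the target, since a rename is only atomic within one filesystem.

```python
import os
import shutil
import tempfile


def atomic_rewrite(path, transform):
    """Replace the contents of `path` with transform(old_text) via rename.

    Hypothetical helper: readers see either the old file or the new one,
    never a half-written mix.
    """
    directory = os.path.dirname(os.path.abspath(path))
    with open(path, "r", encoding="utf-8") as src:
        new_text = transform(src.read())

    # The temp file must be on the same filesystem as the target,
    # so create it in the target's directory.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(new_text)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before renaming
        # mkstemp creates the file with mode 0600; copy the original mode bits.
        shutil.copymode(path, tmp_path)
        os.replace(tmp_path, path)  # rename over the original
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

Note that this only protects readers. Two writers running it at the same time can still lose one of the updates, so a real tool would also need a lock of some kind.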
I agree that this approach is not good for a users directory, I'm just disagreeing that the reason it's not good is performance-related.
Moves are atomic. During the move, at no time is it possible to get the contents of file 1 and file 2 confused when reading from the file descriptors. (Confusion by the human operating things is eminently possible.)
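To make the file descriptor point concrete, here is a small sketch (POSIX semantics assumed; the file names and the function name `read_across_replace` are made up). A reader that opened the file before the rename keeps reading the old contents, while anyone opening the path afterwards gets the new ones.

```python
import os
import tempfile


def read_across_replace():
    """Return (what an already-open reader sees, what a fresh open sees)."""
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "db")
        with open(path, "w") as f:
            f.write("old\n")

        reader = open(path)  # opened before the replace

        tmp = path + ".new"
        with open(tmp, "w") as f:
            f.write("new\n")
        os.replace(tmp, path)  # the path now points at the new file

        before = reader.read()  # descriptor still refers to the old file
        reader.close()

        with open(path) as f:
            after = f.read()  # a fresh open sees the new file
        return before, after
```

On Linux this should return `("old\n", "new\n")`: the old file sticks around until its last open descriptor is closed.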
Most systems come with "vipw" which does the atomic-rename dance to avoid problems with /etc/passwd. In practice this works fine. Things get more complicated when you have alternate PAM arrangements.
A whole bunch of standard functions like getpwent() are defined to read /etc/passwd, so that can't be changed.
`getpwent()` is not defined to only read `/etc/passwd`. The only requirement is that there is some abstract "user database" or "password database" (depending on whether you're reading the Linux man pages or the Single UNIX Specification).
In practice, `getpwent` on Linux with glibc goes through the nsswitch mechanism to produce its return values. One can probably stop using `/etc/passwd` entirely with glibc if every program goes through `getpwent` (and friends) instead of parsing the file itself, and you remove `files` from the `passwd` entry in `/etc/nsswitch.conf`.
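A rough way to see the difference from Python: the standard `pwd` module's `getpwall()` walks the database with `getpwent`, so it reflects whatever sources are configured, while parsing the file by hand only sees the flat file. The helper names below are hypothetical and the parser is deliberately simplistic.

```python
import pwd


def parse_passwd_text(text):
    """Parse passwd(5)-style text into {name: (uid, gid, home, shell)}."""
    users = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) != 7:
            continue  # skip malformed lines
        name, _pw, uid, gid, _gecos, home, shell = fields
        try:
            users[name] = (int(uid), int(gid), home, shell)
        except ValueError:
            continue  # skip entries without numeric ids, e.g. "+::::::"
    return users


def nss_user_names():
    """User names as seen through getpwent, i.e. via the configured sources."""
    return {entry.pw_name for entry in pwd.getpwall()}


def flat_file_user_names(path="/etc/passwd"):
    """User names found by reading the flat file directly."""
    try:
        with open(path, encoding="utf-8", errors="replace") as f:
            return set(parse_passwd_text(f.read()))
    except FileNotFoundError:
        return set()
```

`nss_user_names() - flat_file_user_names()` would then list users that come from somewhere other than the file (LDAP and so on). Treat an empty result with suspicion, though: some backends are configured not to support enumeration, in which case lookups by name or uid work but `getpwent` does not list those users.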
I had >30k users on a bunch of systems in 2001 (I inherited that approach, mind, I'm not -recommending- it).
We moved to LDAP a couple years later because it was a much nicer architecture to deal with overall, but performance and consistency weren't problems in practice.