PHP has had a dedicated password hashing function since version 5.5: it's password_hash().
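For reference, a minimal sketch of that API in use. It assumes PHP 5.5 or later; the password strings and variable names are made up for illustration.

```php
<?php
// Hash a password at registration time. PASSWORD_DEFAULT lets PHP pick
// its current default algorithm, and a random salt is generated for you.
$hash = password_hash('correct horse battery staple', PASSWORD_DEFAULT);

// At login, compare the submitted password against the stored hash.
$ok  = password_verify('correct horse battery staple', $hash);
$bad = password_verify('password123', $hash);

var_dump($ok);   // bool(true)
var_dump($bad);  // bool(false)
```

The returned string carries the algorithm, cost and salt, so it can be stored as-is in a single column.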
But at least, if you want to roll your own password hashing, do it right: use HMAC, salting, stretching and a decent hash function - or use bcrypt/scrypt and save yourself a few headaches.
I'm well aware of that. The original comment suggested that password hashing is bad; I was trying to identify what the poster thought should be used instead.
I assume they meant the use of a cryptographic hash function (which is wrong) instead of a password hash function (which, as the name implies, is suitable for password storage).
For example, I see a lot of this construction in proprietary code:
Yes. Even in places where people use prepared statements and things like LINQ to build queries, somewhere someone will end up wanting or needing a query that is hard to express, so they will start string-building a raw SQL query and forget, or not know, to use prepared statements. We see this /all/ the time in our assessments of our customers' software.
It is markedly less common, but still quite common, if not pervasive like it used to be. PHP apps are dramatically more likely to have security vulnerabilities of any sort. It is a running joke we have that one of our recommendations to remediate security issues with PHP is to "rewrite in a different language". For whatever reason PHP has a history of encouraging vulnerable code. Even in good shops. PHP itself isn't really to blame (it is, of course, much less likely for an experienced shop to write crappy PHP... but they probably wouldn't pick PHP to begin with).
What deserves a lot of the blame is the awful mysql_* API, which didn't support query placeholders. (The mysqli and PDO APIs weren't available until much later, and weren't reliably available on many web hosts for even longer.) This forced PHP developers to become comfortable with constructing SQL queries as strings -- a habit which proved hard to break for many of them.
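To make the difference concrete, here is a rough sketch contrasting the two styles. The table, column and input are hypothetical, and the placeholder half assumes the pdo_sqlite driver is installed (it is skipped otherwise).

```php
<?php
// Hypothetical user input carrying an injection payload.
$name = "nobody' OR '1'='1";

// String building: the input becomes part of the SQL text itself,
// so the WHERE clause now has an extra always-true condition.
$unsafe = "SELECT id FROM users WHERE name = '" . $name . "'";
echo $unsafe, PHP_EOL;

// Placeholder version, using an in-memory SQLite database as a stand-in.
$rows = null;
if (extension_loaded('pdo_sqlite')) {
    $db = new PDO('sqlite::memory:');
    $db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
    $db->exec('CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)');
    $db->exec("INSERT INTO users (name) VALUES ('alice'), ('bob')");

    // The query text is fixed; the input travels separately as data.
    $stmt = $db->prepare('SELECT id FROM users WHERE name = ?');
    $stmt->execute([$name]);
    $rows = $stmt->fetchAll(PDO::FETCH_ASSOC);
    echo count($rows), ' rows matched', PHP_EOL;
}
```

With the placeholder, the payload is compared as a literal name and matches nobody, whereas the string-built query would have matched every row.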
Salting is incredibly effective. Without salting, the entire power of the attacker can be used against all password hashes at once. With salting, those millions of hashes per second translate to only being able to crack simple passwords, or only the passwords of a few users.
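A small sketch of the effect, assuming PHP 5.5+ and using password_hash() as the salted example; the password is deliberately a bad one.

```php
<?php
// Same password hashed twice. password_hash() draws a fresh random salt
// on every call, so the two stored values differ.
$password = 'password123';
$hashA = password_hash($password, PASSWORD_DEFAULT);
$hashB = password_hash($password, PASSWORD_DEFAULT);

var_dump($hashA === $hashB);                   // bool(false)
var_dump(password_verify($password, $hashA));  // bool(true)
var_dump(password_verify($password, $hashB));  // bool(true)

// Unsalted fast hash for contrast: identical input, identical output,
// so a single guess can be checked against every user's hash at once.
$plainA = md5($password);
$plainB = md5($password);
var_dump($plainA === $plainB);                 // bool(true)
```

Because every salted hash has to be attacked separately, two users with the same password no longer fall to the same guess for free.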
MD5 is not that weak, it's just fast, so you have to use a few rounds of it. The preimage vulnerability reduced the search space from 128 bits to 124 [1], which is still extremely secure. People who claim it's broken are probably thinking about the collision vulnerability, which doesn't help you at all to recover the password from a hash.
The problem is not that hard passwords are easy to crack, it's that easy passwords are stupid easy to crack.
An attacker doesn't care about your password so much as they care about getting 50-80% of the passwords on the system, the majority of which are going to be "password123" or some other garbage you can find in a dictionary attack. If you use something like Bcrypt or Scrypt then running your dictionary cracker becomes a real ordeal, it will take a long time to grind through those entries and find matches. With MD5 it's a joke, you can do most of the common ones in seconds and the trickier variants within hours.
That's why I mentioned "a few rounds of it". Use 1000 if you're paranoid, you can make it as computationally complex as BCrypt. And, of course, you should use salts. Good luck breaking that. You couldn't even break my double MD5.
Another thing you should do is not allow common or short passwords. That includes "password123". That alone will immediately elevate your security.
> Additionally, if I can find a bit of text that hashes the same I don't care what the original password was. I have a key that works
And you clearly don't understand how the collision vulnerability works. You can't create one with a given hash. You carefully craft two documents that hash to the same hash, which you don't know in advance. Again, you can prove me wrong by doing just that to the hash in the code snippet.
FFS, stop making cryptographic recommendations to others when you don't have expertise in this area.
Cryptographers nowadays are recommending PBKDF2 (essentially iterated hashing with the hash algorithm of your choice) with at least 100,000 rounds. And it's considered to be by far the weakest of "modern" password hashing approaches, behind bcrypt, scrypt, and Argon2 (in that order).
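For anyone who hasn't seen it, a minimal sketch of PBKDF2 via PHP's built-in hash_pbkdf2(). It assumes PHP 7+ for random_bytes(); the choice of SHA-256, the salt length and the output length are illustrative assumptions, not something from this thread.

```php
<?php
// Illustrative values only.
$password   = 'correct horse battery staple';
$salt       = random_bytes(16);   // per-user random salt
$iterations = 100000;

// hash_pbkdf2(algo, password, salt, iterations, length)
// With hex output, length counts hex digits: 64 digits = 32 bytes.
$derived = hash_pbkdf2('sha256', $password, $salt, $iterations, 64);

// Verification: recompute with the stored salt and iteration count,
// then compare in constant time.
$candidate = hash_pbkdf2('sha256', $password, $salt, $iterations, 64);
$wrong     = hash_pbkdf2('sha256', 'password123', $salt, $iterations, 64);

var_dump(hash_equals($derived, $candidate));  // bool(true)
var_dump(hash_equals($derived, $wrong));      // bool(false)
```

Unlike password_hash(), this leaves storing the salt and iteration count alongside the derived value up to you.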
Your double-MD5 is garbage, and nobody is going to bother wasting their time breaking it because it's a bullshit challenge. If you used a strong, unique password, you've missed the point, because users in the real world don't. If you used a weak password, we have a few billion points of empirical data that contradict you, so why bother installing hashcat and mucking about with password cracking rules when anyone paying attention for the past ten years knows what the outcome is going to be?
MD5 is broken for passwords because it is too fast. Sure, if you use PBKDF2-MD5 with a random salt and 100k iterations you're going to be fine but that's not what anyone in this thread, including you, has been talking about.
You started off saying MD5 is fine, then backed off to saying it's fine with a salt, then fine with two rounds, then suggested 1,000 is okay, now we're at 100,000. At what point do you stop making excuses and acknowledge that your original advice was garbage, and backpedaling repeatedly is not helping your case?
Further, it is impossible to categorically prevent weak passwords. You can impose length restrictions. You can disallow common passwords. You can require special characters. But people's ability to come up with weak, pattern-based passwords fundamentally outperforms our ability to stop them, and at some point restrictions become so burdensome that people stop signing up for your app altogether.
"Just block weak passwords" is ridiculous on its face and you either know it and are arguing simply to save face, or you don't know it and are infuriatingly overconfident for your level of incompetence. I'm past the point of caring which.
A thousand rounds of MD5 is absolute insanity, I mean that's the sort of thing that gets you banned from ever writing security critical code ever again. The cost of an MD5 cycle is basically zero. Zero times a thousand is still zero. Zero times a million is zero. It's got zero value in terms of protection.
Even SHA1, which is a much harder hash to compute, can be processed at trillions of hashes per second on some hardware. If you did a million cycles of SHA1, guess what? That thing can still crack a million guesses per second.
Bcrypt has a difficulty number you can pin at a level that's uncomfortably high. It may take 200ms to verify a password if you really crank it. That means you can't do thousands of hashes per second; you're stuck doing a few hundred hashes per minute. You can't dictionary attack that.
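A quick way to see that difficulty number at work, sketched with password_hash() and PASSWORD_BCRYPT. The cost values are arbitrary picks for the demo and the timings depend entirely on the machine.

```php
<?php
// Time one bcrypt hash at a few cost settings.
// Each +1 on the cost roughly doubles the work.
$timings = [];
foreach ([8, 10, 12] as $cost) {
    $start = microtime(true);
    $hash  = password_hash('password123', PASSWORD_BCRYPT, ['cost' => $cost]);
    $timings[$cost] = microtime(true) - $start;
    printf("cost %2d: %.1f ms\n", $cost, $timings[$cost] * 1000);
}

// The cost is stored inside the hash, so verification needs no extra config.
$info = password_get_info($hash);
var_dump($info['options']['cost']);  // int(12)
```

Raising the cost until one hash takes as long as you can tolerate at login is the usual way to tune it.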
I know how the collision vulnerability works. People have been creating MD5 collisions for fun for a while now, unconcerned with the original hash. If they focused their effort on matching hashes, they could probably do it by exploiting fundamental weaknesses in the MD5 hash system itself.
At this point a sufficiently robust SAT solver can probably crack it.
All your hand-waving about disallowing easy passwords doesn't matter. If you add restrictions, people find ways around them. It also doesn't matter if you make passwords incrementally harder if someone can crack them easily. Their computer doesn't care if they set it to search ten characters instead of eight, modern crackers are pretty efficient to fairly intimidating lengths.
> The cost of an MD5 cycle is basically zero. Zero times a thousand is still zero. Zero times a million is zero.
I don't think you have any clue what you're talking about. Do a million sequential rounds of MD5 on your CPU and measure how long it takes. I guarantee you, it's not zero. Not even close.
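The measurement being proposed is easy to run. A rough sketch, with no salt and a throwaway input, purely to time the loop; it measures a single PHP process on a CPU and nothing else.

```php
<?php
// Feed each digest back into md5() a million times and time the loop.
$rounds = 1000000;
$digest = 'password123';

$start = microtime(true);
for ($i = 0; $i < $rounds; $i++) {
    $digest = md5($digest);
}
$elapsed = microtime(true) - $start;

printf("%d rounds in %.3f s\n", $rounds, $elapsed);
```

Whatever number this prints is the defender's cost per login on this machine; it does not tell you what the same work costs on dedicated cracking hardware.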
GPU accelerated MD5 is a thing and I assure you it's a lot closer to zero than you'd think.
High-end FPGAs can smash through SHA1 at rates of billions per second and MD5 is not even that complicated. A million rounds is not as hard as Bcrypt cranked up to a sufficiently robust level.
While I don't want to downplay his role in mobilizing the community, it's hardly single-handed. Implementing ext/sodium was a group effort by dozens of developers, reviewers, and testers.
Libsodium was Frank Denis's project, which was spawned by NaCl by cryptographers Dan Bernstein, Tanja Lange, and Peter Schwabe.
The participants who voted on the RFC, for the most part, were involved in the technical discussions over the past two years since I first mentioned the notion of doing so (before the PHP 7.0 release).
Similarly, there were 13 people who contributed to the libsodium-php repository (ext/libsodium in PECL). Every single one of them had to consent to relicensing the extension to easily get merged into PHP, and we all did.
I often joke that I'm the worst C developer in all of infosec, but there's some truth to that when it comes to modifying the PHP core.
My main role in the ext/sodium project was saying, "We should do this," and somehow getting people to listen.
Very fair. I don't mean to minimize the efforts of others and the community. I should say as an outsider with a long-held aversion to anything PHP-related, Scott has single-handedly changed my view of the language to a more neutral position.
I'm not entirely sure I believe that. I saw a Paragonie project that was posted a while back, supposedly a "secure by default" CMS. I looked at the code and saw it was a hard-to-audit mess that decided to implement everything itself. Its own ORM, own router, own MVC framework. I spent an hour and found numerous issues[1] in the code.
I don't believe re-implementing the wheel like that is a 'modern security practice'. I asked why they did that instead of using robust, well tested and well supported libraries and did not get a satisfactory answer.
While it's not the author's project (I think?) he is still part of the company, and I'd hope such a security-focused organization wouldn't have done something like that.
This is off topic and reads a bit like a personal attack, but if you want to have a level-headed discussion about Airship's design and implementation, https://github.com/paragonie/airship/issues
A lot has changed since you last looked at it, and a lot will change before the v2.0.0 rewrite is complete.
I'm sorry that you read it as a personal attack, I did not mean for it to read that way. I was merely pointing out something that I consider to be designed contrary to security best practices in response to a comment about security best practices.
I'm glad to hear things have changed in the code however, and wish you the best with the rewrite.
You replied to a comment praising Scott for improving PHP security by saying "I'm not sure I believe that", and then ambiguously attacked his code. Who are you trying to kid?
I don't much care one way or the other about whether it's OK to try to take people down a peg, but I do care very much when people do that and then try to get away with pretending that's not what they're doing.
Meh. Even smart people do dumb things with languages other than PHP.
Any language with string interpolation can create SQL injection vulnerabilities, for example. PHP's docs encourage the right pattern, as do Ruby's, but... https://github.com/gitlabhq/gitlabhq/issues/2464
There are best practices for handling dynamite, too, but I wouldn't build a safe with it.
Nobody should be writing code that queries databases using tainted strings. Perl has had a taint mode for almost 30 years. Ruby supposedly has a taint mode, but I'm not a Ruby dev so I don't know how it affects the vuln you mention.
....so enable Taint mode and fix the errors? And why would you think your code is vulnerable to SQLi and not RCE, or any other user input-related exploit?
People have created secure systems in far worse languages, and people have written laughably insecure code in any language you care to name.
The “problem” with PHP is its ease of use, allowing people who don’t know about good development practices to quickly write their first web application. But with discipline, PHP is a perfectly acceptable language.
As someone who has also been working with PHP since early 4.x, I find I have a lot of sympathy for the haters. It has improved a lot, but there's a lot to hate.
There's clear evidence that PHP, both the language and its popular libraries, has a much worse security history than other languages, even after weighting the level and number of vulnerabilities against language popularity and amount of code.
I don't know. It seems the number of issues in each of those products more or less reflects their size. The fact that PHP comes with a huge builtin function list thanks to bundled extensions is IMHO penalizing it. It would be interesting to match the number of issues in each product with the number of SLOC - possibly excluding comments - I'm sure we'd have a more level playground (assuming they're all mostly C/C++).
PHP's choose-your-own-error-adventure with all errors on (which is the only good choice) is pretty silly. Arrays become somewhat unwieldy because the core libs for arrays are deficient, missing things like a get_default($array, $key, $default).
The array core lib in general is troublesome to understand/intuit different pieces for, probably because mashing hash types and lists together leads to all sorts of ambiguity (like ambiguous json decoding/encoding). Want some subset of an array? Sure, that's the super-intuitive `array_intersect_key($array, array_flip($keys));`
The core array type is really a frustrating experience in PHP, and interop deficiencies with ArrayObject make it unusable as an alternative.
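For anyone puzzling over that subset idiom, a small sketch of what it does; the array contents and key list are made up.

```php
<?php
$array = ['id' => 7, 'name' => 'alice', 'email' => 'a@example.com', 'role' => 'admin'];
$keys  = ['id', 'email', 'missing'];

// array_flip() turns the list of wanted keys into array keys;
// array_intersect_key() then keeps the entries of $array whose keys appear there.
$subset = array_intersect_key($array, array_flip($keys));

print_r($subset);  // id and email only; 'missing' is silently ignored
```

Note that a requested key that doesn't exist simply drops out of the result rather than raising anything.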
No notices, but this syntax was only introduced in 7.0.
Old style is:
isset($array[$key]) ? $array[$key] : $default
Which is, admittedly, just a bit uglier.
Edit: isset() is a language construct, not a function call. So it does not incur performance penalties. The ?? operator was introduced for 'cosmetic' reasons only, so don't rush to update your source...
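Side by side, as a sketch: the ?? form needs PHP 7.0+, and get_default() here is a hypothetical helper named after the one wished for earlier in the thread, not a built-in.

```php
<?php
$array   = ['a' => 1, 'b' => null];
$default = 'fallback';

// PHP 7.0+ null coalescing operator: no notice for a missing key.
$new = $array['missing'] ?? $default;

// Pre-7.0 equivalent.
$old = isset($array['missing']) ? $array['missing'] : $default;

var_dump($new === $old);            // bool(true)

// Both forms treat a key that holds null as if it were absent.
var_dump($array['b'] ?? $default);  // string(8) "fallback"

// Hypothetical helper that distinguishes "missing" from "present but null".
function get_default(array $array, $key, $default)
{
    return array_key_exists($key, $array) ? $array[$key] : $default;
}

var_dump(get_default($array, 'b', $default));        // NULL
var_dump(get_default($array, 'missing', $default));  // string(8) "fallback"
```

If null is a meaningful value in your data, the array_key_exists() variant is the one that preserves it.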
Truthiness (e.g. 0 == false, '0' == true, 'false' == true, etc.) makes it easier to introduce bugs. There's a lot of discipline/linting still required in PHP, whereas other languages won't even compile if you do something risky.
I wouldn't take that criticism too personally. People also argue that C/C++ can't ever be secure due to the ease of programmer mistakes.
I agree that there are safe (or safer) ways to do things in PHP, but in business settings, it's often hard to enforce such things. "Discipline" is very hard to maintain consistently across people with different backgrounds.
I've actually gone as far as rejecting commits that failed the linter and style checker, but that only works for code bases I have control over. It doesn't apply to the libs.
I now prefer languages where the entire ecosystem is passing strict compilation requirements.
Scalar type declarations don't help detect usage of undefined variables, as an example. Phan is the only tool I've found that does, but it's a very heavyweight thing, given how necessary it is to prevent undefined variable exceptions.
Yes, Psalm is great. We use it in a lot of our projects, and in the future will be using it in all of them. We absolutely love it. (It's on my "to blog about" queue.)
I guess that's an interesting nitpick, but I was referring to truthiness. I chose what I now see was a confusing way to express it as text.
What I really meant was that you have some $variable, and you test it in an if, while, etc. context. If the value is 0, it's false. But if you didn't realize you got a string from somewhere, it might be true. Unless it's an empty string? (I can't even remember.)
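Since the details are easy to misremember, here is a short sketch that prints how a handful of values behave in a boolean context. The list of values is just a sample.

```php
<?php
// How a few values behave in an if/while condition (cast to bool).
$values = [0, '0', '', '0.0', 'false', ' ', [], null];

$results = [];
foreach ($values as $value) {
    $label = json_encode($value);       // printable form of the value
    $results[$label] = (bool) $value;
    printf("%-8s => %s\n", $label, $results[$label] ? 'true' : 'false');
}
```

The empty string and the string "0" come out false; every other string, including "0.0", "false" and a single space, comes out true.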
None of this changes the initial point, which is that writing "good" PHP requires enormous amounts of discipline, and that still doesn't address the security issues with PHP itself. I remember when the developer of Suhosin basically quit PHP and said he couldn't deal with the way its maintainers treated security issues.
Then your previous statement about PHP versions is irrelevant, and you should answer the question you were asked. To wit:
> Given how much has been deprecated/removed since 5.2, would you care to list some that remain?
For the version coming out at the end of the year, what is still insecure that leads you to believe "there is not an $x where that a statement is true"?
Sounds like you have no specifics to back up what you're saying and just enjoy bashing PHP because that is the trendy thing to do. You were asked for specifics several times and all you can offer are snarky pejoratives, it reads like textbook trolling.
A defense implies an attack, so someone has to start bashing PHP before anyone has an opportunity to defend it, however, I don't regard asking for specifics as a defense.
My stateless PHP projects are kinda enjoying the view over the GC and memleak hell happening on the other Java/NodeJS projects that I sadly have to deal with.
Most are WordPress, and it's not like WordPress has had a smooth development history... It's one of the most-exploited entry points into a server. That's why anyone running a server for long enough will see tons of hits to their access logs looking to /wp-admin and other WP-specific URIs.
This! I'm still seeing code snippets on StackOverflow that use hashing for password storage and queries without prepared statements.
I think at the moment that is still a weak spot of PHP.
Edit: my mistake. I was referring to weak hashes like md5 and sha1 or 'home brew' hashing.