Why not use MD5 for password hashing?

17,658

Solution 1

The collision resistance property of MD5 has been broken for a long time. Note that that preimage resistance and second preimage resistance have not yet been cracked, however as there are better algorithms out there (SHA-2) it would be wise to move to these rather than relying on a cryptographic hash that has already begun to lose its cryptographic properties. Note: The collision resistance property does not matter when storing hashed passwords - what you need to make sure of is that the preimage resistance property is sound - that it is computationally infeasible to find the original password given a certain hash value (and salt). As I mentioned, since one of the cryptographic properties is already broken I would be concerned that the others will follow soon.

When you store a password hash, you should build in some protection that the original password cannot be retrieved in case an attacker manages to extract these hashes. This is important because if an attacker manages to retrieve the password table only they can then use the data to either log into your system directly, or to log into other systems where the user has reused the same password.

When storing passwords, it is important to use a slow algorithm, such as bcrypt, scrypt or pbkdf2. A legitimate user should only have to experience the delay once, upon first login. An attacker will have to experience the delay for every password that they guess - remember rainbow tables will not be being used here because the passwords are salted. The attacker will be hashing each password guess in line with your chosen algorithm and iteration count.

It is important to tune the number of iterations for your system just so the right "strength" is used not to cause legitimate users any real annoyance when logging into your system. This is known as number of "rounds" or the "iteration count". For example, iterating for about one second should be sufficient. It may be safe to assume that an attacker can run through hashes at ten times the speed of your system's hardware. Therefore this limits the attacker to 10 guesses per second, rather than two billion with MD5.

Regarding DoS attacks

Yes, the extra processing your application performs upto login could be a target for an attacker to submit either really long passwords to your application, or to repeatedly hit it with login requests in order to consume CPU and memory resources on your server. You are right to be concerned.

These type of attacks can be mitigated in the following ways:

  • Log the username and IP address of each login attempt. After say 6 failed attempts, introduce a delay in response from your application if that username or IP is repeated again. This will also help mitigate password guessing attacks in general.
    • For example you could artificially delay by 1 second, then 2 seconds then 4 and up to a reasonable value (e.g. 16 seconds).
    • This has the advantage that an attacker can't lock out another account on purpose as the legitimate user only has to wait 16 seconds.
    • An attacker could use a botnet and random usernames to bypass these checks, however they would need a huge number of IP addresses than they would without this control, and a more casual attacker would also not be aware that the delay in response was artificial.
  • Monitor the number of login attempts on your system. Once this is above a set threshold rate (e.g. 10 per second), introduce a CAPTCHA to solve to continue the login process. The threshold rate you choose very much depends on the user base and capacity of your system.
  • Implement two factor authentication. Only proceed to validate password by hashing once the One Time Password has been validated.

Solution 2

The problem with MD5 is exactly that it is so fast, you can calculate about 9 Giga MD5/s with common hardware. To brute-force a whole english dictionary with about 200000 words you need only a fraction of a milli-second.

This is why appropriate hash algorithms like BCrypt offer a cost factor. The cost factor defines how much time is needed to calculate the hash and can be inreased in future. 50 milliseconds for a login is hardly an obstacle, but for brute-forcing it is deadly.

Share:
17,658
Masoud
Author by

Masoud

Updated on June 18, 2022

Comments

  • Masoud
    Masoud almost 2 years

    I have a friend which is a white hat hacker. He says that md5 is not really that bad and actually is really secure, just if we use it properly.

    I believe that he is right. As I know, there is 3 ways to break hashes:

    1. Using Rainbow tables (Which can be secured against by a long/random salt)
    2. Collision (Which can be prevented by multiple salts or hashes - as in example bellow)
    3. Generation time (Which is not much important if we use a long enough salt value per each user - AFAIK)

    I and my friend believe that Blowfish is not really needy, also it can be harmful because it can slow down password validation process and it can be used with DDOS attacks to break down server even with fewer attak-resources.

    So, I want to make sure that is following algorithm really secure or not? And, is there a real reason to go with Blowfish hash algorithm or not?

    // return a 256 bit salt + 128 bit md5 binary hash value
    function hash(password, salt=null)
    {
        salt = (salt != null) ? salt : Random256BitBinaryValueGenerator();
        // What about using another user-specified parameter, like email address as salt?
    
        return salt + md5(salt + password) + md5(password + salt);
    
        // Or just use a non-cryptographic hash algorithm like crc32 to prevent collisions:
        // return salt + md5(salt + password) + crc32(salt + password);
    
        // Or even use two different salts:
        // return salt + md5(salt + password) + md5('C' + salt + password);
    }
    
    // check password
    function check(password, hash_value)
    {
        return hash(password, substring(hash_value, 0, 32)) == hash_value;
    }
    
    • Masoud
      Masoud almost 9 years
      I know that SHA2 is secure, or Blowfish is even more secure, but I want to know the real reason to go with these newer algorithms, I mean, if we can make md5 secure, why go with another algorithm? According to following link, probability of just two hashes accidentally colliding is (1/2)^128, so md5(s+p) + md5(p+s) probability of colliding is (1/2)^256, which is very secure in my opinion, even for security-hungry applications when combined with a random salt. stackoverflow.com/questions/201705/…
    • Masoud
      Masoud almost 9 years
      Also, as I know, because of fixed length of hashs outputs, there is not a collision-free algorithm out there, so, if we didn't find any collisions in sha2 yet, it doesn't mean that sha2 is secure (which is not because what I said). So, why go with sha2 when we can secure md5? (If it is possible at all?)
    • CodesInChaos
      CodesInChaos almost 9 years
      What you want is bcrypt, not blowfish.
    • Masoud
      Masoud almost 9 years
      @CodesInChaos, bcrypt is based on blowfish.
    • CodesInChaos
      CodesInChaos almost 9 years
      @Masoud With the same reasoning you could say "C" when you actually mean "C#". Blowfish is a blockcipher, bcrypt is a password hash based on the blowfish key setup.
    • Masoud
      Masoud almost 9 years
      I said that bcrypt is based on blowfish too. Actually it is not really important that what bcrypt is and where it came from, but thank you any way for correcting me. I never say blowfish to bcrypt again, promise!
  • Masoud
    Masoud almost 9 years
    I always thought that we can make just a few thousands md5s/s even with good cpus. But 9 Giga MD5/s is really crazy!!!
  • martinstoeckli
    martinstoeckli almost 9 years
    @Masoud - This is really an impressive number, note that the linked software uses the GPU not the CPU. But nevertheless, everybody can buy some cheap GPUs and start the race...
  • martinstoeckli
    martinstoeckli almost 9 years
    @Masoud - Also have a look at the derivations (just some lines above on the linked page). There you can see that cracker tools offer a lot of possible combinations like md5($salt.$pass.$salt) out of the box.
  • Masoud
    Masoud almost 9 years
    Yeah. With a ~$200 gpu, we can calculate ~8.5 billion hashes per second. With a 10 chars length password (containing a-z, A-Z and 0-9 chars) we can break it in 13714 hours in average even with this software, and with a 12 chars length password, it needs 6000 years in average just to break a single hash value. Isn't it enough? ((26+26+10)^12/(8.5*10^9) / 60 / 60 / 24 / 365 / 2 = 6017 years) Isn't it better to force user to choose a better password with better strength instead of using more complex hash algorithms? didn't this algorithms become another md5 in future soon, based on moore's law?
  • Masoud
    Masoud almost 9 years
    Neil, In my opinion, an slow algorithm is not much secure. I have 2 reasons, one is explained in question (DDOS Attacks) and another is commented in @martinstoeckli answer. Can you please check these out and tell me if I'm wrong or not?
  • Masoud
    Masoud almost 9 years
    SilverlightFox, In my opinion, an slow algorithm is not much secure. I have 2 reasons, one is explained in question (DDOS Attacks) and another is commented in @martinstoeckli answer. Can you please check these out and tell me if I'm wrong or not?
  • martinstoeckli
    martinstoeckli almost 9 years
    @Masoud - A good question. • No it does not become the next MD5, because of the cost factor, increasing this factor by 1 will double the necessary time to calculate a hash. • And no, you can encourage the users to use stronger passwords, but you cannot force them. Complex password rules will conflict with good password schemes, and users are very inventive with passwords: Password-2015 will satisfy most rules, but is a very weak password. Even weak passwords should be protected as good as possible, even if they are not recommended.
  • Masoud
    Masoud almost 9 years
    Reason number 2 is: By using GPU, we can calculate ~8.5 billion hashes per second. With a 10 chars length password (containing a-z, A-Z and 0-9 chars) we can break it in 13714 hours in average even with this software, and with a 12 chars passwords, it needs 6000 years in average just to break a single hash value. Isn't it enough? ((26+26+10)^12/(8.5*10^9) / 60/60/24/365 / 2=6017 years) Isn't it better to force user to choose a better password with better strength instead of using more complex hash algorithms? didn't this algorithms become another md5 in future soon, based on moore's law?
  • Masoud
    Masoud almost 9 years
    Can't we implement algorithms to check real password strength, and then prevent users to use passwords like Password-2015? like using a dictionary or checking characters pronunciations? (e.g. prevent ini, oba, cio in passwords, but allow "gtz" which is not easy to pronounce) Also, I asked this question regarding the cost factor: stackoverflow.com/questions/30509081/…
  • Masoud
    Masoud almost 9 years
    Also, thank you for these very good advices. I used all of them, but in different ways. I set a constant delay of 2 seconds for all login attempts and after this 2 seconds, I check connection status to see if user is waiting for response or connection is dropped. And I introduce captcha after each fail login attempts per IP. Which your suggestions is much better than my way.
  • SilverlightFox
    SilverlightFox almost 9 years
    Remember, password strength is not a real metric - it depends on the sources of entropy actually used. Yes a keyspace attack on a 10 char password might take 517 days, however if you know the user hasn't chosen a truly random password as human's are not random then practically it won't take 517 days. You can't ever force a user to create a truly random password unless you specify it yourself - and that has its own problems.
  • SilverlightFox
    SilverlightFox almost 9 years
    That's why we choose an algorithm with a configurable amount of iterations. As Moore's law takes hold, you increase the number of iterations.
  • Masoud
    Masoud almost 9 years
    Ha ha, Good question. MD5 is fast, It seems that is an advantage, but by keeping @SilverlightFox mentioned advices in mind to prevent DoS & DDos attacks, it is not an advantage anymore.
  • Masoud
    Masoud almost 9 years
    As you and SilverlightFox helped me on this topic, and I can not accept both answers in stack overflow, I accept SilverlightFox answer, so, please remove your comment in another question and re-post it as answer, so I can accept that too. Thank you very much for helping me.
  • Masoud
    Masoud almost 9 years
    YES! Human's are not random, I remember that I read an article on that 2 or 3 years before, but I completely forgot that. Thank you for remembering that to me. Now I promise that I will not forget why MD5 is not secure: 1. MD5 is really fast 2. Human's are not random. Thank you very much for your help.
  • Neil Smithline
    Neil Smithline almost 9 years
    You have a link to the DDOS reference?