What's the big deal with brute force on hashes like MD5

12,901

Solution 1

You are talking about 2 distinct (although related) problems. First is the likely-hood of a collision, and the second is the ability to run the algorithm on tons of values to find the original value which created the hash.

  1. Collisions. If you run sha1(md5(text)) you first get the hash of md5, then pass that to sha1. Lets assume the sha1 function has a 128-bit output, and the md5 also has 128-bit output. Your chance of collision in the md5 function is 1/2^128. Then your chance of collision in the sha1 is 1/2^128. If either collides then the function overall collides and hence the result is (1/2^128) + (1/2^128) or 1/2^127
  2. Brute forcing. Running sha1(md5(text)) will only double the time it takes to find the original string. This is nothing in terms of security. FOr instance, if you have 128-bits of output space for each algorithm, and it takes 1 hour to brute force, then it will take 2 hours to run the same brute force twice to get the original string. This would be the same as increasing the output space to 129-bits. However, if you want to really make brute forcing impossible, what you have to do is double the output-size (which can be compared to the key size in encryption).

Solution 2

First of all md5 and sha1 are not encryption functions, they are message digest functions. Also most hashes are broken in real world using dictionary attacks like John The Ripper and Rainbow Crack.

John The Ripper is best suited for salted passwords where the attacker knows the salt value. Rainbow Crack is good for passwords with small unknown salts and straight hashes like md5($pass).

Rainbow Crack takes a long time to build the tables, but after that passwords break in a matter of seconds. It depends on how fast your disk drives are.

Solution 3

A collision attack (the type that's known against MD5, for example) does no real good. To be effective with regard to a password, you need a preimage attack (i.e. the ability to find some input that will hash to a known hash code). Though there are preimage attacks known against MD5, they're not currently practical.

Collision attacks are useful for entirely different purposes. One example that's been carried out is creating two X.509 certificates for two different identities that collide. Submit one to be signed by a certificate authority, and then you can use the other to claim that you're somebody else entirely. Since the hash will collide with the first, when/if a user tries to verify the certificate, it will show up as having been verified.

Solution 4

First not encryption creating Message Digest using the hash functions.

your question:

but can't you just encrypt (hash) your password using, say, MD5 and then, say, SHA-1 (or any other, doesn't matter.)

if the hash function does not provide any of these properties, it does not matter how many times you hashed, also the attacker can hash n times to get the collisions.

  1. For any given code h, it is computationally infeasible to find such x that H(x)=h, this property is called one way or preimage resistant.

  2. For any given block x ,it is computationally infeasible to find y≠x with H(y)=H(x).This property is referred second preimage resistant or weak collision resistant

  3. It is computationally infeasible to find any pear (x,y) such that H(x)=H(y). This is called Strong collision resistant.

So as The Rook mentioned, the passwords are stored by adding different salt values for each users. The dictionary gets longer and also computational overhead and time gets longer for the attacker if she exploits the password file.

Let's say attacker has the hashed values of the passwords, and starts reading from the dictionary file and compares with the hashed values if matches then pasword is cracked, if salt is used then read from the dictionary and add some salt value then try to find a match.However this should be done for each user. So the complexity that salt adds is (from wikipedia)

Assume a user’s (encrypted) secret key is stolen and he is known to use one of 200,000 English words as his password. The system uses a 32-bit salt. The salted key is now the original password appended to this random 32-bit salt. Because of this salt, the attacker’s pre-calculated hashes are of no value. He must calculate the hash of each word with each of 2^32 (4,294,967,296) possible salts appended until a match is found. The total number of possible inputs can be obtained by multiplying the number of words in the dictionary with the number of possible salts: alt text

if H(password+salt)(in system)=H(Your password+salt) (login process)
login else
print<<error

Solution 5

When you hash a password multiple times you actually increase the chance of hash collisions, so best practice is to hash only once.

It also has nothing to do with how easy it will be to perform a brute-force attack. Such an attack will systematically try every possible password within a given range. Thus, if your password is "foobar" and the attack tests the password "foobar" it wont matter how or how many times you hashed the password, because the brute-force attack successfully guessed it.

Therefore, if you wish to guard yourself against a brute-force attack, you could limit how often a user can attempt authorization or require passwords to be above a certain length.

On a side note; Rainbow Tables and similar methods are used by hackers that have already gained access to your database and are meant to decrypt the stored password. In order make such an attack more difficult, you should use static and dynamic salts.

Share:
12,901
Jan K.
Author by

Jan K.

I'm a consultant with focus on analytics, performance management and digital transformation. In my spare time I like to dabble in Android and web applications and frequent SO in these topics.

Updated on July 19, 2022

Comments

  • Jan K.
    Jan K. almost 2 years

    I just spent some time reading https://stackoverflow.com/questions/2768248/is-md5-really-that-bad (I highly recommend!).

    In it, it talks about hash collisions. Maybe I'm missing something here, but can't you just encrypt your password using, say, MD5 and then, say, SHA-1 (or any other, doesn't matter.) Wouldn't this increase the processing power required to brute-force the hash and reduce the possibility of collision?

  • tster
    tster almost 14 years
    Hashing a hash would not reduce the probability of a collision.
  • evilertoaster
    evilertoaster almost 14 years
    Particularly when using different algorithms it could. I don't think a blanket statement like that is true in all cases...or at least, an example could be formulated where hashing a hash does reduce possibility.
  • Stephen P
    Stephen P almost 14 years
    No, it doesn't reduce the probability of a collision, whether or not different algorithms are used. Using the example of sha1(md5($pass)), if pass=="X" and pass=="Y" produce a collision under md5(), the same md5 output text is passed into the sha1() hash so it is guaranteed to produce the same output - i.e. a collision. Worse, md5("X") and md5("Y") could produce different output, which when fed to sha1() happens to produce its own collision so, as @Zackman said, you actually increase the probability of a collision.
  • evilertoaster
    evilertoaster almost 14 years
    I agree with your example, perhaps it should delete my answer or at least edit it to say "could reduce"... My initial thought was md5($pass) vs something like md5(sha512($pass)) with short input (shorter than the output of sha512). In this case wouldn't the latter be less prone to collisions than the former (although maybe not less prone than just sha512 itself), since it's essentially adding more bits to the md5 than just the input itself?
  • rook
    rook almost 14 years
    The use of salts does not prevent the breaking of hashes. It does however make precomputed attacks more resource intensive. John The Ripper is not a precomputed attack, and if the attacker has the salt it is no more difficult to break than a straight hash md5($pass).
  • Nick Johnson
    Nick Johnson over 13 years
    "When you hash a password multiple times you actually increase the chance of hash collisions, so best practice is to hash only once." - this is completely untrue. If it were true, secure hash functions wouldn't be.
  • sillyMunky
    sillyMunky over 12 years
    Um, this answer is not true with respect to collisions. Collisions in the md5 space will transfer to the sha1 function in the sense that the reduced entropy of the md5 function implies a reduced entropy of input into the sha1 function and therefore a reduced entropy in the total number of hashes this pair of functions is able to produce. The 1/2^128 is completely wrong.
  • tster
    tster over 12 years
    @sillyMunky, yup, you're right. The likelyhood of a collision increases from 1/2^128 to 2/2^128 (aka 1/2^127). So adding sha1 of the md5 hash increases chances of collision. I've updated the answer to reflect this.
  • Rasmus Faber
    Rasmus Faber over 12 years
    It is worse than that regarding collisions. Any collision for MD5 (MD5(text1)=MD5(text2)) is trivially also a collision for SHA1+MD5 (SHA1(MD5(text1))=SHA1(MD5(text2)). So since MD5 is broken in regards to collision, SHA1+MD5 is also broken.
  • sillyMunky
    sillyMunky over 12 years
    If you really want to reduce the possibility of collision, you need to take both md5 and sha1 hashes independently and check both. Then any collision found in one algorithm would also have to be a collision in the other or one of the checks would fail. This massively reduces the probability of any collision being useful to any attacker (or happening by accident).