Why is a SHA-1 hash 40 characters long if it is only 160 bits?


Solution 1

One hex character can only represent 16 different values, i.e. 4 bits (16 = 2^4).

40 × 4 = 160.


And no, you need much more than 5 characters in base-36.

There are 2^160 different SHA-1 hashes in total.

2^160 = 16^40, so this is another reason why we need 40 hex digits.

But 2^160 = 36^(160 · log_36(2)) = 36^30.9482..., so you still need 31 characters using base-36.
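
As a quick sanity check, the required length in any base is ceil(160 / log2(base)). A minimal Java sketch (class and method names are mine, purely illustrative):

    public class DigestLength {
        // Number of base-`radix` digits needed to hold `bits` bits of information.
        static long digitsNeeded(int bits, int radix) {
            double bitsPerDigit = Math.log(radix) / Math.log(2);
            return (long) Math.ceil(bits / bitsPerDigit);
        }

        public static void main(String[] args) {
            System.out.println(digitsNeeded(160, 16)); // 40 hex digits
            System.out.println(digitsNeeded(160, 36)); // 31 base-36 digits
            System.out.println(digitsNeeded(160, 64)); // 27 base-64 digits (28 with '=' padding)
        }
    }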

Solution 2

I think the OP's confusion comes from the fact that a string representing a SHA-1 hash takes 40 bytes (at least if you are using ASCII), which equals 320 bits (not 640 bits).

The reason is that the hash is in binary and the hex string is just an encoding of that. So if you were to use a more efficient encoding (or no encoding at all), you could take only 160 bits of space (20 bytes), but the problem with that is it won't be binary safe.

You could use base64 though, in which case you'd need about 27-28 bytes (or characters) instead of 40 (see this page).
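
To see this concretely, here is a small hedged Java sketch (the input string and class name are arbitrary) that prints the digest size in raw bytes, as hex, and as Base64:

    import java.nio.charset.StandardCharsets;
    import java.security.MessageDigest;
    import java.util.Base64;

    public class Sha1Encodings {
        public static void main(String[] args) throws Exception {
            byte[] digest = MessageDigest.getInstance("SHA-1")
                    .digest("hello".getBytes(StandardCharsets.UTF_8));

            // The usual hex representation: two characters per byte.
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            String base64 = Base64.getEncoder().encodeToString(digest);

            System.out.println(digest.length);   // 20 bytes = 160 bits
            System.out.println(hex.length());    // 40 hex characters
            System.out.println(base64.length()); // 28 characters (27 of data plus one '=' of padding)
        }
    }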

Solution 3

There are two hex characters per 8-bit-byte, not two bytes per hex character.

If you are working with 8-bit bytes (as in the SHA-1 definition), then a hex character encodes a single high or low 4-bit nibble within a byte. So it takes two such characters for a full byte.
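
A rough illustration of that split (the helper names are mine, not from the SHA-1 definition):

    public class NibbleDemo {
        private static final char[] HEX = "0123456789abcdef".toCharArray();

        // One 8-bit byte yields exactly two hex characters: high nibble, then low nibble.
        static String byteToHex(byte b) {
            int high = (b >> 4) & 0x0F; // upper 4 bits
            int low  = b & 0x0F;        // lower 4 bits
            return "" + HEX[high] + HEX[low];
        }

        public static void main(String[] args) {
            System.out.println(byteToHex((byte) 0xAB)); // prints "ab"
            // A 20-byte SHA-1 digest therefore needs 20 * 2 = 40 such characters.
        }
    }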

Solution 4

2 hex characters make up a range from 0-255, i.e. 0x00 == 0 and 0xFF == 255. So 2 hex characters are 8 bits, which makes 160 bits for your SHA digest.
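
Going the other way (a tiny sketch using only standard library calls), two hex characters parse back to a value that always fits in one byte:

    public class HexPairDemo {
        public static void main(String[] args) {
            System.out.println(Integer.parseInt("00", 16)); // 0
            System.out.println(Integer.parseInt("ff", 16)); // 255
            // So each pair of hex characters carries 8 bits, and 40 / 2 = 20 bytes = 160 bits.
        }
    }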

Solution 5

My answer only differs from the previous ones in my theory as to the EXACT origin of the OP's confusion, and in the baby steps I provide for elucidation.

A character takes up different numbers of bytes depending on the encoding used (see here). There are a few contexts these days when we use 2 bytes per character, for example when programming in Java (here's why). Thus 40 Java characters would equal 80 bytes = 640 bits, the OP's calculation, and 10 Java characters would indeed encapsulate the right amount of information for a SHA-1 hash.

Unlike the thousands of possible Java characters, however, there are only 16 different hex characters, namely 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E and F. But these are not the same as Java characters, and take up far less space than the encodings of the Java characters 0 to 9 and A to F. They are symbols signifying all the possible values represented by just 4 bits:

0  0000    4  0100    8  1000    C  1100
1  0001    5  0101    9  1001    D  1101
2  0010    6  0110    A  1010    E  1110
3  0011    7  0111    B  1011    F  1111

Thus each hex character is only half a byte, and 40 hex characters give us 20 bytes = 160 bits - the length of a SHA-1 hash.
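
To make the 640-bit miscount concrete, here is a hedged sketch (the hash string is just an example value) showing that a 40-character Java String occupies 80 bytes as UTF-16 code units, even though the digest it spells out is only 20 bytes:

    import java.nio.charset.StandardCharsets;

    public class CharSizeDemo {
        public static void main(String[] args) {
            String hexHash = "2fd4e1c67a2d28fced849ee1bb76e7391b93eb12"; // a 40-character hex digest

            // Each Java char is a 16-bit UTF-16 code unit: 40 chars * 16 bits = 640 bits of storage.
            System.out.println(hexHash.getBytes(StandardCharsets.UTF_16LE).length); // 80 bytes

            // But each hex character only carries 4 bits of information:
            System.out.println(hexHash.length() * 4); // 160 bits = 20 bytes
        }
    }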

Comments

  • AGrunewald
    AGrunewald almost 2 years

    The title of the question says it all. I have been researching SHA-1 and most places I see it being 40 hex characters long, which to me is 640 bits. Could it not be represented just as well with only 10 hex characters, 160 bits = 20 bytes? And one hex character can represent 2 bytes, right? Why is it twice as long as it needs to be? What am I missing in my understanding?

    And couldn't a SHA-1 be even just 5 or fewer characters if using Base32 or Base36?

  • NullUserException
    NullUserException almost 14 years
    You can have 5 characters if you use base 4.3e9
  • Ben Hocking
    Ben Hocking almost 14 years
    @NullUserException: Unfortunately, I run out of digits after about base 2.09e6.
  • AGrunewald
    AGrunewald almost 14 years
    Thank you so much for this explanation I knew that my math must have been off. Been a few years since I had to do some serious math.
  • Jeyekomon
    Jeyekomon over 5 years
    Isn't it interesting that if you more than double the character set (from 16 to 36), you might expect the resulting string to shrink to less than half of its original size, yet that does not happen (from 40 only to 31)...