How to calculate the hash code of a string by hand?
Solution 1
Take a look at the source code of java.lang.String
.
/**
* Returns a hash code for this string. The hash code for a
* <code>String</code> object is computed as
* <blockquote><pre>
* s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]
* </pre></blockquote>
* using <code>int</code> arithmetic, where <code>s[i]</code> is the
* <i>i</i>th character of the string, <code>n</code> is the length of
* the string, and <code>^</code> indicates exponentiation.
* (The hash value of the empty string is zero.)
*
* @return a hash code value for this object.
*/
public int hashCode() {
int h = hash;
int len = count;
if (h == 0 && len > 0) {
int off = offset;
char val[] = value;
for (int i = 0; i < len; i++) {
h = 31*h + val[off++];
}
hash = h;
}
return h;
}
Solution 2
Most hash functions of this sort calculate the hash value modulo some large number (e.g. a large prime). This avoids overflows and keeps the range of values returned by the function within a specified range. But this also means an infinite range of input values will get a hash value from a finite set of possible values (i.e. [0,modulus)), hence the problem of hash collisions.
In this case, the code would look something like this:
public int hash(String x){
int hashcode=0;
int MOD=10007;
int shift=29;
for(int i=0;i<x.length();i++){
hashcode=((shift*hashcode)%MOD+x.charAt(i))%MOD;
}
return hashcode;
}
Exercise for the reader:
See the code for the hashCode
function for java.util.String. Can you see why it does not use a modulus explicitly?
Solution 3
The following statements will find the string hashCode
String str="Hi";
int a = str.hashCode();//returns 2337
Let's check how exactly its calculated
HashCode = s[0]*31(n-1) + s[1]*31(n-2) + .. s(n-2)
As we all know that the character at position 0 is H, Character at position 1 is i, and the string length is 2.
==> H*31(2-1) + i*31(2-2)
As we all know that, ASCII code of H is 72, and i is 105. It means,
==> 72 * 31 + 105 * 1 (Anything Power 0 is 1)
==> 2232 + 105 = 2337
Source: https://www.tutorialgateway.org/find-string-hashcode-in-java/
thomascirca
Updated on July 09, 2022Comments
-
thomascirca almost 2 years
I was wondering how to calculate the hash code for a given string by hand. I understand that in Java, you can do something like:
String me = "What you say what you say what?"; long whatever = me.hashCode();
That's all good and dandy, but I was wondering how to do it by hand. I know the given formula for calculating the hash code of a string is something like:
S0 X 31 ^ (n-1) + S1 X 31 ^ (n-2) + .... + S(n-2) X 31 + S(n-1)
Where S indicates the character in the string, and n is the length of the string. Using 16 bit unicode then, the first character from string me would be computed as:
87 X (31 ^ 34)
However, that creates an insanely large number. I can't imagine adding all the characters together like that. So, in order to calculate the lowest-order 32 bits result then, what would I do? Long whatever from above equals -957986661 and I'm not how to calculate that?
-
thomascirca almost 14 yearsI get the basic idea (I can compute small strings) but when the string gets large I'm unsure as to what to do.
-
Vishy almost 14 years@user458346, the size of the string isn't important. This is the values of using a loop, it doesn't matter how long the loop is, it does get any more complicated.
-
jjczopek almost 14 yearsI can't see... Could You explain it?
-
MAK almost 14 years@jjczopek: Notice that
x%2^n = x&(2^n-1)
. So if you did arithmetic modulo 2^n, you just need to keep the last n bits of your value, discarding any higher bits. Now think what happens when you just use anint
to represent your value. Any arithmetic you do will result in only the last 32 bits being left over. Voila, you have arithmetic modulo 2^32. -
deceleratedcaviar over 12 yearsRight. How did you not see that jjczopek >_<.
-
paradox over 4 yearswhere is offset been initiated?