How can I generate an MD5 hash in Java?
Solution 1
You need java.security.MessageDigest.
Call MessageDigest.getInstance("MD5") to get an MD5 instance of MessageDigest that you can use.
Then compute the hash by doing one of the following:
- Feed the entire input as a byte[] and calculate the hash in one operation with md.digest(bytes).
- Feed the MessageDigest one byte[] chunk at a time by calling md.update(bytes). When you're done adding input bytes, calculate the hash with md.digest().
The byte[] returned by md.digest() is the MD5 hash.
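The two options above can be sketched roughly as follows. This is only an illustration, not the one true way to do it: the class name Md5DigestSketch and its helper methods are made up for this example, and the checked NoSuchAlgorithmException is rethrown unchecked just to keep the sketch short.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical helper class, named for this example only.
final class Md5DigestSketch {

    // Creates a fresh MD5 MessageDigest; rethrows the checked exception unchecked for brevity.
    static MessageDigest newMd5() {
        try {
            return MessageDigest.getInstance("MD5");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available in this runtime", e);
        }
    }

    // Option 1: hand over the entire input and hash it in one operation.
    static byte[] oneShot(byte[] input) {
        return newMd5().digest(input);
    }

    // Option 2: feed the input chunk by chunk, then finish with digest().
    static byte[] chunked(byte[] first, byte[] second) {
        MessageDigest md = newMd5();
        md.update(first);
        md.update(second);
        return md.digest();
    }

    public static void main(String[] args) {
        byte[] whole = "hello world".getBytes(StandardCharsets.UTF_8);
        byte[] first = "hello ".getBytes(StandardCharsets.UTF_8);
        byte[] second = "world".getBytes(StandardCharsets.UTF_8);

        // Both routes hash the same byte sequence, so the digests match.
        System.out.println(Arrays.equals(oneShot(whole), chunked(first, second)));
        // An MD5 digest is 128 bits, i.e. 16 bytes.
        System.out.println(oneShot(whole).length);
    }
}
```

Either route gives the same 16-byte result; the chunked form mainly matters when the input does not fit comfortably in memory.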
Solution 2
The MessageDigest class can provide you with an instance of the MD5 digest.
When working with strings and the crypto classes, always specify the encoding you want the byte representation in. If you just use string.getBytes(), it will use the platform default charset, and not all platforms use the same default.
import java.security.*;
// ...
byte[] bytesOfMessage = yourString.getBytes("UTF-8");
MessageDigest md = MessageDigest.getInstance("MD5");
byte[] theMD5digest = md.digest(bytesOfMessage);
If you have a lot of data, take a look at the .update(xxx) methods, which can be called repeatedly. Then call .digest() to obtain the resulting hash.
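For a large input, the repeated update() calls could look something like the sketch below. It assumes the data arrives as an InputStream; the class name Md5StreamSketch, the method md5Of and the 8192-byte buffer size are arbitrary choices for this example, and the checked exceptions are rethrown unchecked only to keep it short.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class, named for this example only.
final class Md5StreamSketch {

    // Reads the stream in fixed-size chunks and feeds each chunk to update().
    static byte[] md5Of(InputStream in) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] buffer = new byte[8192]; // arbitrary chunk size
            int read;
            while ((read = in.read(buffer)) != -1) {
                // Only the first 'read' bytes of the buffer are valid on each pass.
                md.update(buffer, 0, read);
            }
            return md.digest();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available in this runtime", e);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // A ByteArrayInputStream stands in for a file or network stream here.
        byte[] data = "hello world".getBytes(StandardCharsets.UTF_8);
        byte[] digest = md5Of(new ByteArrayInputStream(data));
        System.out.println(digest.length + " bytes");
    }
}
```

In real code you would typically pass something like a file stream opened in a try-with-resources block, so the whole input never has to sit in memory at once.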
Solution 3
If you actually want the answer back as a string as opposed to a byte array, you could always do something like this:
String plaintext = "your text here";
MessageDigest m = MessageDigest.getInstance("MD5");
m.reset();
// Note: getBytes() without an argument uses the platform default charset.
m.update(plaintext.getBytes());
byte[] digest = m.digest();
// Signum 1 makes BigInteger treat the digest as a non-negative number.
BigInteger bigInt = new BigInteger(1, digest);
String hashtext = bigInt.toString(16);
// toString(16) drops leading zeros, so pad back to the full 32 characters.
while (hashtext.length() < 32) {
    hashtext = "0" + hashtext;
}
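As one of the comments further down suggests, the padding loop can be folded into a single String.format call. Here is a minimal sketch of that variant; the class name Md5HexSketch and the method md5Hex are made up for this example, and UTF-8 is an assumed encoding that you should swap for whatever the other side of your comparison uses.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper class, named for this example only.
final class Md5HexSketch {

    // Returns the MD5 of the text as a 32-character lowercase hex string.
    static String md5Hex(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            // UTF-8 is an assumption; both sides of a comparison must agree on the encoding.
            byte[] digest = md.digest(text.getBytes(StandardCharsets.UTF_8));
            // %032x left-pads with zeros to 32 hex digits, so leading zeros are kept.
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available in this runtime", e);
        }
    }

    public static void main(String[] args) {
        // "a" is a handy test input because its digest starts with a zero nibble.
        System.out.println(md5Hex("a")); // 0cc175b9c0f1b6a831c399e269772661
    }
}
```

The input "a" is a useful check for any hex conversion, since a version that strips leading zeros would return only 31 characters for it.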
Solution 4
You might also want to look at the DigestUtils class of the Apache Commons Codec project, which provides convenient methods for creating MD5 or SHA digests.
Solution 5
Found this:
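public String MD5(String md5) {
    try {
        java.security.MessageDigest md = java.security.MessageDigest.getInstance("MD5");
        // Note: getBytes() without an argument uses the platform default charset.
        byte[] array = md.digest(md5.getBytes());
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < array.length; ++i) {
            // OR-ing with 0x100 forces three hex digits, so substring(1, 3) keeps any leading zero.
            sb.append(Integer.toHexString((array[i] & 0xFF) | 0x100).substring(1, 3));
        }
        return sb.toString();
    } catch (java.security.NoSuchAlgorithmException e) {
        // MD5 is unavailable in this runtime; fall through and return null.
    }
    return null;
}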
It comes from the site below. I take no credit for it, but it's a solution that works! Lots of other code didn't work properly for me; I ended up missing zeros in the hash. This one seems to give the same result as PHP. Source: http://m2tec.be/blog/2010/02/03/java-md5-hex-0093
Comments
-
Sigmund Reed about 2 years
I have used material from here and a previous forum page to write some code for a program that will automatically calculate the semantic similarity between consecutive sentences across a whole text. Here it is;
The code for the first part is copy pasted from the first link, then I have this stuff below which I put in after the 245 line. I removed all excess after line 245.
with open("File_Name", "r") as sentence_file:
    while x and y:
        x = sentence_file.readline()
        y = sentence_file.readline()
        similarity(x, y, true)  # boolean set to false or true
        x = y
        y = sentence_file.readline()
My text file is formatted like this:
Red alcoholic drink.
Fresh orange juice.
An English dictionary.
The Yellow Wallpaper.
In the end I want to display all the pairs of consecutive sentences with the similarity next to each, like this:
["Red alcoholic drink.", "Fresh orange juice.", 0.611],
["Fresh orange juice.", "An English dictionary.", 0.0]
["An English dictionary.", "The Yellow Wallpaper.", 0.5]
if norm(vec_1) > 0 and if norm(vec_2) > 0:
    return np.dot(vec_1, vec_2.T) / (np.linalg.norm(vec_1)* np.linalg.norm(vec_2))
elif norm(vec_1) < 0 and if norm(vec_2) < 0:
    ???Move On???
-
Leif Gruenwoldt almost 12 years
-
rustyx over 9 years
MD5 might be unsafe as a one-way security feature, but it is still good for generic checksum applications.
-
Admin over 7 years
dict.has_key has been deprecated for nearly a decade now: docs.python.org/3.0/whatsnew/3.0.html#builtins
-
Sigmund Reed over 7 years
Sorry, so is that the only problem, and if so, how can I fix it? Probably a stupid question, but I'm really new to Python.
-
Admin over 7 years
My previous comment contained a link. Click on the link. Look at the page contained therein. Read the bullet point about dict.has_key().
-
Admin over 7 years
Hint: what is meant by "dict.has_key() has been deprecated" is that you can no longer call the has_key method on a dictionary. Instead, use the in membership operator. docs.python.org/3/reference/…
-
Sigmund Reed over 7 years
Hi, I apologize, but Python is still very new for me. I swapped hypernyms_2.has_key(lcs_candidate): for hypernyms_2.in(lcs_candidate): and it said invalid syntax.
-
Admin over 7 years
That's because in is an operator, not a method. Try lcs_candidate in hypernyms_2
-
Sigmund Reed over 7 years
Sorry again, I fixed that stuff (thank you so much) but then I get this. Look in the comments please.
-
Admin over 7 years
I suspect that's caused by dividing by zero somewhere... Also, cosine similarity is built in to SciPy: docs.scipy.org/doc/scipy/reference/generated/…
-
Sigmund Reed over 7 years
What would you suggest to fix that mess? Preferably without using scipy and sticking to the code I have already.
-
Admin over 7 years
Check to make sure that neither vec_1 nor vec_2 is the zero vector (i.e. has length zero) before calculating the cosine similarity. Just use if/else... i.e. if the norms of the vectors are both positive, then you're good to go, otherwise... well, skip that pair or throw an exception or... do what you want to do.
-
Admin over 7 years
If you don't want to use SciPy to calculate the cosine similarity, then that's fine, too... calculating the dot product and dividing by the product of the norms works as well. Just make sure that both of the norms are positive.
-
Admin over 7 years
Also, it's worth pointing out that you only got a warning, not an exception (i.e. your code kept going). Testing on my end indicates that np.nan (i.e. NumPy's nan value, nan meaning "not a number") would be returned when vec_1 or vec_2 has a norm of zero.
-
Sigmund Reed over 7 years
This is going to be really annoying, but I'm a linguistics professor with minimal to no Python experience. How would this be done? I realize how sickening I am but I can't find any other help on short notice. Also, nothing was returned, not even nan.
-
Admin over 7 years
Well, what do you want to do if you encounter a vector with norm zero when computing the cosine similarities? Throw an error and quit? Silently continue with the next pair (assuming that you're computing these inside some for loop, which may or may not be the case)? That's not a question that I can answer. You have to decide the flow of logic for your code.
-
Admin over 7 years
You can also just let the warnings be thrown and deal with the nan values in the output afterwards.
-
Sigmund Reed over 7 years
I tried something in the comments, it is obviously erroneous. Also don't know how to implement.
-
Admin over 7 years
Norms of vectors are never negative... So, your elif norm(vec_1) < 0 and if norm(vec_2) < 0: can just be an else:
-
Admin over 7 years
Also, if norm(vec_1) > 0 and if norm(vec_2) > 0: is invalid syntax. anh.cs.luc.edu/python/hands-on/3.1/handsonHtml/…
-
Admin over 7 years
Incidentally, I don't know what you're using to write your code, but you might want to use an IDE (integrated development environment) or text editor with the ability to point out simple syntax errors. I'd recommend PyCharm: jetbrains.com/pycharm (there's a free and not-free edition... the free edition will be more than adequate for what you're trying to do).
-
Akshay over 15 years
Could you point me to some resources where I can read about the relative merits and weaknesses of each?
-
Bombe over 15 years
“LATIN1” != “ASCII” (or “US-ASCII”). ASCII is a 7-bit character set, Latin1 is an 8-bit character set. They are not the same.
-
Rob over 15 years
In particular, the methods which return "safe" encoded representations of the byte data in string form.
-
Piskvor left the building over 15 years
(see joelonsoftware.com/articles/Unicode.html for much better rationale and explanation)
-
Spidey almost 14 years
@BalusC: Not true, the BigInteger.toString method will return the full number in the base specified. 0x0606 will be printed as 606; only leading zeros are omitted.
-
squiddle over 13 years
If you use Apache Commons Codec anyway you can use: commons.apache.org/codec/api-release/org/apache/commons/codec/…
-
David Leppik about 13 years
SHA1 is overkill unless you want a cryptographically secure hash, i.e. you don't want the hash to help in reconstructing the original message, nor do you want a clever attacker to create another message which matches the hash. If the original isn't a secret and the hash isn't being used for security, MD5 is fast and easy. For example, Google Web Toolkit uses MD5 hashes in JavaScript URLs (e.g. foo.js?hash=12345).
-
David Leppik about 13 years
Minor nitpick: m.reset() isn't necessary right after calling getInstance. More minor: 'your text here' requires double-quotes.
-
bluish about 13 years
I would replace last line with this:
String result = Hex.encodeHexString(resultByte);
-
Paŭlo Ebermann almost 13 years
You should specify the encoding to be used in getBytes(), otherwise your code will get different results on different platforms/user settings.
-
iuiz almost 13 years
However there is no easy way to get the DigestUtils class into your project without adding a ton of libs, or porting the class "per hand" which requires at least two more classes.
-
sparkyspider over 12 years
Can't find it in maven repos either. Grrrr.
-
Nick Spacek over 12 years
Should be in the central Maven repositories, unless I'm going crazy: groupId=commons-codec artifactId=commons-codec version=1.5
-
Jeremy Huiskamp over 12 years
What makes you think file integrity is not a security issue?
-
weekens about 12 years
This topic is also useful if you need to convert the resulting bytes to hex string.
-
kriegaex almost 12 years
Oh BTW, before anyone except for myself notices how bad my JRE knowledge really is: I just discovered DigestInputStream and DigestOutputStream. I am going to edit my original solution to reflect what I have just learned.
-
Nilzor over 11 years
Why has this answer -1 while the other, shorter and less descriptive answer has +146?
-
mjuarez over 11 years
One thing that's not mentioned here, and caught me by surprise. The MessageDigest classes are NOT thread safe. If they're going to be used by different threads, just create a new one, instead of trying to reuse them.
-
Dave.B over 11 years
Nice using BigInteger to get a hex value +1
-
kovica about 11 years
I just found out that in some cases this only generates 31 characters long MD5 sum, not 32 as it should be
-
Heshan Perera about 11 years
@kovica this is because the starting zeros get truncated, if I remember right.
String.format("%032x", new BigInteger(1, hash));
This should solve this. 'hash' is the byte[] of the hash.
-
Bombe about 11 years
It uses multiple methods to mutate its internal state. How can the lack of thread safety be surprising at all?
-
Blaze Tama over 10 years
@PaŭloEbermann is MessageDigest.getInstance("MD5"); not enough? I tried to add "MD5" in getBytes() but it returned an error
-
Paŭlo Ebermann over 10 years
@BlazeTama "MD5" is not an encoding, it is a message digest algorithm (and not one which should be used in new applications). An encoding is an algorithm pair which transforms bytes to strings and strings to bytes. An example would be "UTF-8", "US-ASCII", "ISO-8859-1", "UTF-16BE", and similar. Use the same encoding as every other party which calculates a hash of this string, otherwise you'll get different results.
-
Ajax over 10 years
This is a solid, standalone library with minimal dependencies. Good stuff.
-
alex about 10 years
and String.format("%1$032X", big) to have an uppercase format
-
Dan Barowy almost 10 years
@Bombe: why should we expect to have to know about MessageDigest's internal state?
-
Bombe almost 10 years
@DanBarowy well, you are mutating it (i.e. calling methods that do not return values but cause other methods to return different values) so until proven otherwise you should always assume that it’s not thread-safe to do so.
-
Richard over 9 years
For an example of the character set... (use UTF-8, that is the best and most compatible in my opinion)...
byte[] array = md.digest(md5.getBytes(Charset.forName("UTF-8")));
-
rwitzel over 9 years
This is the method that provides the same return value as the MySQL function md5(str). A lot of the other answers did return other values.
-
EpicPandaForce over 9 years
This doesn't work right on Android because Android bundles commons-codec 1.2, for which you need this workaround: stackoverflow.com/a/9284092/2413303
-
Gelldur about 9 years
This answer has bug with charset type!
-
Jannick almost 9 years
This is probably the worst solution as it strips leading zeros.
-
user253751 almost 9 years
@Traubenfuchs MessageDigest allows you to input the data in chunks. That wouldn't be possible with a static method. Although you can argue they should have added one anyway for convenience when you can pass all the data at once.
-
ASA almost 9 years
Makes sense. I guess you wouldn't always want to move around byte arrays with multiple Gigabytes! Still, just let it take a stream.
-
supernova almost 9 years
@HeshanPerera How come you mentioned in your answer "getting a String representation back from an MD5 hash"!!? But your code shows logic to convert String to Md5 hash. If I am not wrong MD5 hash is a one way algorithm and it can't be converted back to original String.
-
albanx over 8 years
and what about if I want the pure string ?
-
Kurt Alfred Kluever over 8 years
or using one of the shortcut methods:
Hashing.md5().hashString("my string").asBytes();
-
Daniel Kamil Kozar over 8 years
@albanx : there is no such thing as a "pure string", unless you meant the serialized contents of the Java object itself. Please refer to the previously posted link to Joel On Software.
-
albanx over 8 years
@DanielKamilKozar I needed the hex string to save in db. dac2009 has posted the solution for this
-
Nacho about 8 years
Welcome to StackOverflow, you might want to read how to post an answer before doing so. Give a bit of context explaining why you posted that code and what does it do. Also consider taking the time to format your answer to be easily understood by readers.
-
Justin about 8 years
@KurtAlfredKluever don't forget to insert the charset like 'Hashing.md5().hashString("my string", Charsets.UTF_8).asBytes()'
-
bkrish about 8 years
I found it very useful. It took 15357 ms for a 4.57GB file whereas java inbuilt implementation took 19094 ms.
-
kbolino about 8 years
@Traubenfuchs and what would it do with the bytes that it read from that stream, throw them away?
-
ASA about 8 years
I believe back when I wrote this I thought about an "finalized" & "ready for consumption" InputStream that would be fully drained by the static method. Any necessary state would be saved in the method body.
-
Markus Pscheidt about 8 years
Great. It doesn't fall into the trap of cutting leading zeros.
-
Joabe Lucena over 7 years
As @CedricSimon said, that's exactly what I was looking for. Upvote here.. Thanks!
-
Sigmund Reed over 7 years
Hello, I added the code but I got these errors (written in the question)
-
holmis83 over 7 years
Only one-liner I've seen that doesn't use an external library.
-
Isuru about 7 years
Is this Kotlin language?
-
walshie4 almost 7 years
Unless I'm mistaken this returns always in uppercase which will not align with md5's made without using hex. Not even really sure it is a true md5
-
gildor over 6 years
@Isuru looks like Scala
-
Ilya Serbis over 6 years
actually it accepts not only MD5 bytes array (size == 16). You can pass byte array of any length. It will be converted to MD5 bytes array by means of MD5 MessageDigest (see nameUUIDFromBytes() source code)
-
Humphrey over 6 years
This was very easy and simple I would recommend this for all of the visitors
-
Humphrey over 6 years
Then how do u convert this thedigest to a string so that we can insert it in mysql ?
-
Fran Marzoa over 6 years
This does not answer the question, it's just a couple of links. stackoverflow.com/help/how-to-answer
-
Fran Marzoa over 6 years
Beware this won't work for Android if you're using API level < 19, but you just need to change the second line with md5.update(string.getBytes("UTF-8")); This will add yet another checked exception to handle, though...
-
James about 6 years
BTW: The performance of this is much better than using BigInteger to create the hex string representation.
-
Hummeling Engineering BV over 5 years
Better yet, where possible use yourString.getBytes(StandardCharsets.UTF_8). This prevents handling an UnsupportedEncodingException.
-
tom about 5 years
From Java 11 on, you can use hashtext = "0".repeat(32 - hashtext.length()) + hashtext instead of the while, so the editors won't give you a warning that you're doing string concatenation inside a loop.
-
dac2009 almost 5 years
Since it's not my solution, and I didn't test all scenarios myself, I will leave it unchanged, although I think specifying encoding etc. is probably a good idea.
-
JGFMK over 4 years
This seems far superior. You don't even have to capture as many exceptions either.
-
user1819780 about 4 years
Instead of m.update(plaintext.getBytes()); I would recommend specifying the encoding, such as m.update(plaintext.getBytes("UTF-8")); getBytes() does not guarantee the encoding and may vary from system to system, which may result in different MD5 results between systems for the same String.
-
Arundale Ramanathan almost 4 years
This was very useful. I was having problems with MessageDigest.getInstance("MD5").
-
Logesh S almost 3 years
Worked perfectly for Gravatar's email MD5 hash! Thank you