Convert a string representation of a hex dump to a byte array using Java?

Solution 1

Update (2021) - Java 17 now includes java.util.HexFormat (only took 25 years):

HexFormat.of().parseHex(s)
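
As a rough illustration of the one-liner above, here is a small round trip through HexFormat. It assumes a Java 17 or later runtime; the class name HexFormatDemo and its helper methods are made up for this sketch.

```java
import java.util.Arrays;
import java.util.HexFormat;

// Hypothetical demo class; requires Java 17+ for java.util.HexFormat.
class HexFormatDemo {
    // Parses an even-length hex string (upper or lower case) into bytes.
    static byte[] decode(String s) {
        return HexFormat.of().parseHex(s);
    }

    // Formats bytes as uppercase hex; the default HexFormat emits lowercase.
    static String encodeUpper(byte[] bytes) {
        return HexFormat.of().withUpperCase().formatHex(bytes);
    }

    public static void main(String[] args) {
        byte[] bytes = decode("00A0BF");
        System.out.println(Arrays.toString(bytes)); // [0, -96, -65]
        System.out.println(encodeUpper(bytes));     // 00A0BF
    }
}
```

parseHex throws IllegalArgumentException for odd-length or non-hex input, so no extra argument checking is needed here.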


For older versions of Java:

Here's a solution that I think is better than any posted so far:

/* s must be an even-length string. */
public static byte[] hexStringToByteArray(String s) {
    int len = s.length();
    byte[] data = new byte[len / 2];
    for (int i = 0; i < len; i += 2) {
        data[i / 2] = (byte) ((Character.digit(s.charAt(i), 16) << 4)
                             + Character.digit(s.charAt(i+1), 16));
    }
    return data;
}

Reasons why it is an improvement:

  • Safe with leading zeros (unlike BigInteger) and with negative byte values (unlike Byte.parseByte)

  • Doesn't convert the String into a char[], or create StringBuilder and String objects for every single byte.

  • No library dependencies that may not be available
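
To make the first bullet concrete, here is a small sketch of the two pitfalls it mentions. The class and method names (HexPitfalls, viaBigInteger, parseByteAccepts) are invented for the demonstration.

```java
import java.math.BigInteger;
import java.util.Arrays;

// Hypothetical demo class showing why BigInteger and Byte.parseByte are risky here.
class HexPitfalls {
    // BigInteger treats the string as a number, so leading zero bytes are dropped
    // and a 0x00 sign byte is prepended when the top bit of the value is set.
    static byte[] viaBigInteger(String s) {
        return new BigInteger(s, 16).toByteArray();
    }

    // Byte.parseByte parses a signed value in [-128, 127], so "80".."FF" are rejected.
    static boolean parseByteAccepts(String twoHexDigits) {
        try {
            Byte.parseByte(twoHexDigits, 16);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(viaBigInteger("0001"))); // [1]  (leading zero byte lost)
        System.out.println(Arrays.toString(viaBigInteger("A0BF"))); // [0, -96, -65]  (extra sign byte)
        System.out.println(parseByteAccepts("7F"));                 // true
        System.out.println(parseByteAccepts("A0"));                 // false
        // Parsing into a wider type and casting down avoids the range problem.
        System.out.println((byte) Integer.parseInt("A0", 16));      // -96
    }
}
```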

Feel free to add argument checking via assert or exceptions if the argument is not known to be safe.
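
One possible way to add that checking is sketched below. It is a variant of the method above, not a replacement for it; the class name HexDecoder and the exception messages are assumptions for the example.

```java
// Hypothetical wrapper class for a validating variant of hexStringToByteArray.
class HexDecoder {
    static byte[] hexStringToByteArray(String s) {
        int len = s.length();
        // Each byte needs exactly two hex digits, so fail fast on odd lengths.
        if (len % 2 != 0) {
            throw new IllegalArgumentException("hex string must have even length, got " + len);
        }
        byte[] data = new byte[len / 2];
        for (int i = 0; i < len; i += 2) {
            // Character.digit returns -1 for characters that are not valid in radix 16.
            int hi = Character.digit(s.charAt(i), 16);
            int lo = Character.digit(s.charAt(i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("invalid hex digit near index " + i);
            }
            data[i / 2] = (byte) ((hi << 4) | lo);
        }
        return data;
    }

    public static void main(String[] args) {
        byte[] bytes = hexStringToByteArray("00A0BF");
        System.out.println(java.util.Arrays.toString(bytes)); // [0, -96, -65]
    }
}
```

Without the digit check, an input such as "gg" would silently decode to a garbage byte, because the two -1 values are combined like ordinary digits.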

Solution 2

One-liners:

import javax.xml.bind.DatatypeConverter;

public static String toHexString(byte[] array) {
    return DatatypeConverter.printHexBinary(array);
}

public static byte[] toByteArray(String s) {
    return DatatypeConverter.parseHexBinary(s);
}

Warnings:

  • In Java 9 (Jigsaw), javax.xml.bind is no longer part of the default java.se root set, so using it results in a ClassNotFoundException or NoClassDefFoundError at run time unless you specify --add-modules java.se.ee (thanks to @eckes). Java 11 removed the JAXB API from the JDK entirely, so there it has to be added as a separate dependency.
  • Not available on Android (thanks to Fabian for noting that), but you can copy the source code if your platform lacks javax.xml for some reason. Thanks to @Bert Regelink for extracting the source.

Solution 3

The Hex class in commons-codec should do that for you.

http://commons.apache.org/codec/

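import org.apache.commons.codec.binary.Hex;
...
// decodeHex throws the checked DecoderException on odd-length or non-hex input.
// The String overload needs commons-codec 1.11+; on older versions pass s.toCharArray().
byte[] decoded = Hex.decodeHex("00A0BF");
// 0x00 0xA0 0xBF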

Solution 4

You can now use BaseEncoding in Guava to accomplish this. Note that base16() uses the uppercase alphabet, so lowercase input needs base16().lowerCase().

BaseEncoding.base16().decode(string);

To reverse it, use

BaseEncoding.base16().encode(bytes);

Solution 5

Actually, I think the BigInteger solution is very nice:

new BigInteger("00A0BF", 16).toByteArray();

Edit: Not safe for leading zeros, as noted by the poster.
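
If you still want to go through BigInteger, one way to work around this is to copy its output, right-aligned, into an array of the expected length. This restores dropped leading zeros and discards the extra sign byte that appears when the first byte has its high bit set. The sketch below is only an illustration; the class name BigIntegerHex and the method name decode are made up.

```java
import java.math.BigInteger;
import java.util.Arrays;

// Hypothetical helper that normalizes BigInteger output to one byte per two hex digits.
class BigIntegerHex {
    static byte[] decode(String s) {
        // BigInteger rejects the empty string, and odd lengths cannot map to whole bytes.
        if (s.isEmpty() || s.length() % 2 != 0) {
            throw new IllegalArgumentException("expected a non-empty, even-length hex string");
        }
        int expected = s.length() / 2;
        byte[] raw = new BigInteger(s, 16).toByteArray();
        byte[] out = new byte[expected];
        // Right-align: pads missing leading zero bytes and drops a leading sign byte.
        int n = Math.min(raw.length, expected);
        System.arraycopy(raw, raw.length - n, out, expected - n, n);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(decode("0001"))); // [0, 1]
        System.out.println(Arrays.toString(decode("A0BF"))); // [-96, -65]
    }
}
```

Note that BigInteger also accepts an optional sign character, so a string such as "-1" is not rejected by this sketch; a stricter version would validate the digits first.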

Author by

TommyTh

Updated on July 20, 2022

Comments

  • TommyTh
    TommyTh almost 2 years

    I am looking for a way to convert a long string (from a dump), that represents hex values into a byte array.

    I couldn't have phrased it better than the person that posted the same question here.

    But to keep it original, I'll phrase it my own way: suppose I have a string "00A0BF" that I would like interpreted as the

    byte[] {0x00,0xA0,0xBf}
    

    what should I do?

    I am a Java novice and ended up using BigInteger and watching out for leading hex zeros. But I think it is ugly and I am sure I am missing something simple.

  • palantus
    palantus almost 16 years
    The string concatenation is unnecessary. Just use Integer.valueOf(val, 16).
  • pfranza
    pfranza almost 16 years
    I've tried using the radix conversions like that before and I've had mixed results
  • palantus
    palantus almost 16 years
    Close, but this method fails on the given input "00A0BBF". See bugs.sun.com/bugdatabase/view_bug.do?bug_id=6259307.
  • palantus
    palantus almost 16 years
    First, you shouldn't need to convert the string to uppercase. Second, it is possible to append chars directly to a StringBuffer, which should be much more efficient.
  • TommyTh
    TommyTh almost 16 years
    Also strangely it does not deal with "9C"
  • TommyTh
    TommyTh almost 16 years
    thanks - oddly it works fine with this string: "9C001C" or "001C21" and fails with this one: "9C001C21" Exception in thread "main" java.lang.NumberFormatException: For input string: "9C001C21" at java.lang.NumberFormatException.forInputString(Unknown Source)
  • Marc Stober
    Marc Stober almost 16 years
    @mmyers: whoa. That's not good. Sorry for the confusion. @ravigad: 9C has the same problem because in this case the high bit is set.
  • TommyTh
    TommyTh almost 16 years
    I also thought so initially. And thank you for documenting it - I was just thinking I should... it did some strange things though that I didn't really understand - like omit some leading 0x00 and also mix up the order of 1 byte in a 156 byte string I was playing with.
  • Dave L.
    Dave L. almost 16 years
    That's a good point about leading 0's. I'm not sure I believe it could mix up the order of bytes, and would be very interested to see it demonstrated.
  • TommyTh
    TommyTh almost 16 years
    yeah, as soon as I said it, I didn't believe me either :) I ran a compare of the byte array from BigInteger with mmyers' fromHexString and (with no 0x00) against the offending string - they were identical. The "mix up" did happen, but it may have been something else. I will look more closely tomorrow
  • Dave L.
    Dave L. almost 16 years
    This also looks good. See org.apache.commons.codec.binary.Hex.decodeHex()
  • TommyTh
    TommyTh almost 16 years
    It was interesting. But I found their solution hard to follow. Does it have any advantages over what you proposed (other than checking for even number of chars)?
  • hansvb
    hansvb over 14 years
    Thanks. There should be a built-in for this. Especially that Byte.parseByte croaks on negative values is cumbersome.
  • Gabriel Llamas
    Gabriel Llamas about 13 years
    Produces a wrong result. See the apache implementation in the below post.
  • Dave L.
    Dave L. about 13 years
    Can you give an example that is decoded incorrectly, or explain how it's wrong?
  • ovdsrn
    ovdsrn about 13 years
    It doesn't work for the String "0". It throws an java.lang.StringIndexOutOfBoundsException
  • ovdsrn
    ovdsrn about 13 years
    It works only if the input string (hexString) has an even number of characters. Otherwise: Exception in thread "main" java.lang.IllegalArgumentException: hexBinary needs to be even-length:
  • Dave L.
    Dave L. about 13 years
    "0" is not valid input. Bytes require two hexadecimal digits each. As the answer notes, "Feel free to add argument checking...if the argument is not known to be safe."
  • GrkEngineer
    GrkEngineer about 13 years
    Oh, thanks for pointing that out. A user really shouldn't have an odd number of characters because the byte array is represented as {0x00,0xA0,0xBf}. Each byte has two hex digits or nibbles. So any number of bytes should always have an even number of characters. Thanks for mentioning this.
  • BonanzaDriver
    BonanzaDriver over 12 years
    Thanks Michael - you're a life saver! Working on a BlackBerry project and trying to convert a string representation of a byte back into the byte ... using RIM's "Byte.parseByte( byteString, 16 )" method. Kept throwing a NumberFormatException. Spent hours trying to figure out why. Your suggestion of "Integer.parseInt()" did the trick. Thanks again!!
  • Gray
    Gray over 12 years
    The issue with BigInteger is that there must be a "sign bit". If the leading byte has the high bit set then the resulting byte array has an extra 0 in the 1st position. But still +1.
  • Trevor Freeman
    Trevor Freeman about 12 years
    javax.xml.bind.DatatypeConverter.parseHexBinary(hexString) seems to be about 20% faster than the above solution in my micro tests (for whatever little they are worth), as well as correctly throwing exceptions on invalid input (e.g. "gg" is not a valid hexString but will return -17 using the solution as proposed).
  • Trevor Freeman
    Trevor Freeman about 12 years
    You can use javax.xml.bind.DatatypeConverter.parseHexBinary(hexString) directly instead of using HexBinaryAdapter (which in turn calls DatatypeConverter). This way you do not have to create an adapter instance object (since DatatypeConverter methods are static).
  • Sean Coffey
    Sean Coffey over 11 years
    This is a different problem really, and probably belongs on another thread.
  • MuhammadAnnaqeeb
    MuhammadAnnaqeeb over 10 years
    I don't understand this: Safe with leading zeros (unlike BigInteger) and with negative byte values (unlike Byte.parseByte), how are these two unsafe? Could you give me any examples for testing, please?
  • Dave L.
    Dave L. over 10 years
    @MuhammadAnnaqeeb See other answers below that use BigInteger or Byte.parseByte
  • Chef Pharaoh
    Chef Pharaoh over 9 years
    @Dave Shouldn't this be using logical shift instead of arithmetic shift?
  • Dave L.
    Dave L. over 9 years
    @ChefPharaoh There is only a single left shift operator in Java which has the same effect as logical or arithmetic shift. See en.wikipedia.org/wiki/Logical_shift
  • rfornal
    rfornal over 9 years
    This question has been answered for a while and has several good alternatives in place; unfortunately, your answer does not provide any significantly improved value at this point.
  • Arun George
    Arun George over 9 years
    What does a hex dump look like? Can someone give an example?
  • Brett Ryan
    Brett Ryan over 9 years
    I'm wondering if data[i / 2] = (byte) (((Character.digit(s.charAt(i), 16) << 4) | Character.digit(s.charAt(i + 1), 16))); could be slightly more optimal? Or maybe a 0xFF mask could also be necessary.
  • DaedalusAlpha
    DaedalusAlpha about 9 years
    If the for-statement is changed to for (int i = 0; i < len - 1; i += 2) the function will no longer throw exceptions for data of invalid length. It still won't return correct data in those cases, but at least there is no longer a crash.
  • Dave L.
    Dave L. about 9 years
    @DaedalusAlpha It depends on your context, but usually I find it's better to fail fast and loud with such things so that you can fix your assumptions rather than silently returning incorrect data.
  • Priidu Neemre
    Priidu Neemre almost 9 years
    IMHO this should be the accepted/top answer since it's short and cleanish (unlike @DaveL's answer) and doesn't require any external libs (like skaffman's answer). Also, <Enter a worn joke about reinventing the bicycle>.
  • Rohan
    Rohan over 8 years
    Works like a charm. For the index out of bounds issue, prefix the hex string with "0"s to pad it to whatever size you need the output to be. For example, if you are converting a 2-byte hex string, do this before calling the method: StringUtils.leftPad(hextString, 4, "0"). Two characters in the hex string are converted to 1 byte.
  • Fabian
    Fabian over 8 years
    the datatypeconverter class is not available in android for example.
  • greybeard
    greybeard over 8 years
    (That's not more odd than in the Byte/byte case: highest bit set without leading -)
  • Prashant
    Prashant over 7 years
    Warning: in Java 9 Jigsaw this is no longer part of the (default) java.se root set so it will result in a ClassNotFoundException unless you specify --add-modules java.se.ee
  • dantebarba
    dantebarba over 7 years
    Amazing answer, could decode a base64 string request, used hex to convert the byte array with the provided solution and it worked like a charm
  • DragShot
    DragShot almost 7 years
    @dantebarba I think javax.xml.bind.DatatypeConverter already provides a method for encoding/decoding Base64 data. See parseBase64Binary() and printBase64Binary().
  • Jamey Hicks
    Jamey Hicks almost 7 years
    (byte)Short.parseShort(thisByte, 16) solves that problem
  • Stephen M -on strike-
    Stephen M -on strike- almost 7 years
    javax.xml.bind.* is no longer available in Java 9. The dangerous thing is code using it will compile under Java 1.8 or earlier (Java 9 with source settings to earlier), but get a runtime exception running under Java 9.
  • Stephen M -on strike-
    Stephen M -on strike- almost 7 years
    DatatypeConverter is also not available in Java 9 by default. The dangerous thing is code using it will compile under Java 1.8 or earlier (Java 9 with source settings to earlier), but get a runtime exception under Java 9 without "--add-modules java.se.ee".
  • Dmytro
    Dmytro about 6 years
    can confirm that depending on javax.xml.bind can cause java.lang.NoClassDefFoundError.
  • Erik Aronesty
    Erik Aronesty almost 6 years
    I added a reason... when developing for android/linux/windows/macosx/ios in one source base, this solution always works.
  • David Mordigal
    David Mordigal over 5 years
    To add to the issues with DatatypeConverter: Java SE 11 removed the JAXB API entirely, so it is now only included with Java EE. You can also add it as a Maven dependency, as suggested here: stackoverflow.com/a/43574427/7347751
  • poby
    poby about 4 years
    I have tested many different methods but this one is at least twice as fast!
  • LarsH
    LarsH almost 4 years
    I would strongly disagree with '"0" is not valid input.' 0 is a perfectly valid hex string. Just because it isn't safe for this code doesn't mean it's not valid. The fact that each byte is normally expressed as two hex characters doesn't mean 0 isn't a valid hex string; hex strings can be converted to many things besides byte arrays. If the function requires an even-length hex string as input, that should be clearly documented, instead of leaving it to the user to analyze the code in order to know what its assumptions are. garethrees.org/2014/05/08/heartbleed
  • Dave L.
    Dave L. almost 4 years
    @LarsH Thanks for the reminder that it's always good to be clear about one's assumptions. In the context of this question, about a hex dump, with an example using pairs of characters, including leading zeros, I think it's a fair assumption that the data should represent bytes using two characters each. However, I can imagine domains where one would want to allow the special case of a single or leading '0' character instead. I agree that if this function is part of a public API that behavior either way should be documented, as it is in org.apache.commons.codec.binary.Hex and the XML datatypes
  • LarsH
    LarsH almost 4 years
    Fair enough. The question title does say "hex dump," not just "hex string." BTW I just used your code in a production project. (And added a check for odd-length strings.) So thanks.
  • LarsH
    LarsH almost 4 years
    P.S. I made a proposed edit to document the assumption. Obviously you can change it as desired.
  • Bhavin Chauhan
    Bhavin Chauhan over 3 years
    104207194088 TESTINGG 4304 GG2741 ��ôCONN���|��$GK23000023R00P08TESTINGG1042071940882010260720‌​03BVodafone IN��������������;�� - > Getting this "??" type of response while receiving data using datagram socket. Any suggestion where i am going wrong?
  • Synesso
    Synesso almost 3 years
    This is the goat.
  • dave_thompson_085
    dave_thompson_085 over 2 years
    For that matter you don't need any StringBuffer (which since 2004 could better be StringBuilder), just do new String (enc, i, 2)