Java: get UTF-8 hex values from a string?


Solution 1

Don't convert to an encoding like UTF-8 if you want the code point. Use Character.codePointAt.

For example:

Character.codePointAt("\u05D0\u05D1", 0) // returns 1488, or 0x5d0
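
To get the 05D0 and 05D1 output asked for in the question, each code point can be formatted as zero-padded hex. The following is a minimal sketch; the class name CodePointHex and the method toHex are made up for illustration and are not part of any library.

```java
import java.util.List;
import java.util.stream.Collectors;

public class CodePointHex {
    // Hypothetical helper: formats each Unicode code point of s
    // as uppercase hex, zero-padded to at least 4 digits.
    static List<String> toHex(String s) {
        return s.codePoints()
                .mapToObj(cp -> String.format("%04X", cp))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        for (String hex : toHex("\u05D0\u05D1")) {
            System.out.println(hex); // prints 05D0, then 05D1
        }
    }
}
```

Iterating with codePoints() rather than charAt means a character outside the Basic Multilingual Plane is reported as one code point instead of two surrogate values.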

Solution 2

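Negative values occur because Java's byte is signed, with a range of -128 to 127. Integer.toHexString takes an int, so a negative byte is sign-extended first and prints as, for example, ffffffd7. The following code produces positive values: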

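import java.nio.charset.StandardCharsets;

String a = "\u05D0\u05D1";
// StandardCharsets.UTF_8 avoids the checked UnsupportedEncodingException
byte[] xxx = a.getBytes(StandardCharsets.UTF_8);

for (byte x : xxx) {
    // Mask to the low 8 bits so the byte is read as an unsigned value (0 to 255)
    System.out.println(Integer.toHexString(x & 0xFF));
}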

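The only difference from the code in the question is that it prints x & 0xFF instead of x. The mask promotes the byte to an int and keeps only the low 8 bits, which drops the sign. Note that the output is the UTF-8 encoding of the string (d7, 90, d7, 91), not the code points 05D0 and 05D1.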
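
If the goal is a single hex string of the UTF-8 bytes rather than one value per line, the same masking idea can be wrapped in a helper. This is a sketch under assumptions: the class name Utf8Hex and the method toUtf8Hex are hypothetical names chosen for the example.

```java
import java.nio.charset.StandardCharsets;

public class Utf8Hex {
    // Hypothetical helper: hex-encodes the UTF-8 bytes of s,
    // two lowercase hex digits per byte.
    static String toUtf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            // b & 0xFF converts the signed byte to an int in the range 0 to 255
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(toUtf8Hex("\u05D0\u05D1")); // prints d790d791
    }
}
```

The %02x format pads each byte to two digits, so a byte such as 0x0a is not shortened to a single character as it would be with Integer.toHexString.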

Author: thedp


Updated on June 15, 2022

Comments

  • thedp
    thedp almost 2 years

    I would like to convert a UTF-8 string to a hex string. In the example below I've created a sample string containing two letters. When I try to get the hex values, I get negative values.

    How can I make it give me 05D0 and 05D1?

    String a = "\u05D0\u05D1";
    byte[] xxx = a.getBytes("UTF-8");
    
    for (byte x : xxx) {
       System.out.println(Integer.toHexString(x));
    }
    

    Thank you.