Representing char as a byte in Java
Solution 1
To convert characters to bytes, you need to specify a character encoding. Some character encodings use one byte per character, while others use two or more bytes. In fact, for many languages, there are far too many characters to encode with a single byte.
In Java, the simplest way to convert from characters to bytes is with the String class's getBytes(Charset) method. (The StandardCharsets class defines some common encodings.) However, this method silently replaces any character that cannot be mapped under the specified encoding with the charset's default replacement bytes (typically ?). If you need more control, you can configure a CharsetEncoder to report an error in this case or to use a different replacement.
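As a minimal sketch of the two approaches described above, the following compares the lenient getBytes(Charset) call with a CharsetEncoder configured to report errors. The class name EncodingDemo and its method names are illustrative, and ISO-8859-1 is only an example charset.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class EncodingDemo {

    // Lenient: unmappable characters are silently replaced (with '?' for ISO-8859-1).
    public static byte[] encodeLenient(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    // Strict: an unmappable or malformed character raises CharacterCodingException.
    public static byte[] encodeStrict(String text) throws CharacterCodingException {
        CharsetEncoder encoder = StandardCharsets.ISO_8859_1.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        ByteBuffer buffer = encoder.encode(CharBuffer.wrap(text));
        byte[] bytes = new byte[buffer.remaining()];
        buffer.get(bytes);
        return bytes;
    }

    public static void main(String[] args) {
        // U+00E9 (e with acute accent) exists in ISO-8859-1: one byte.
        System.out.println(encodeLenient("\u00E9").length); // prints 1
        // U+20AC (euro sign) does not: it is replaced by '?' (63).
        System.out.println(encodeLenient("\u20AC")[0]);     // prints 63
        try {
            encodeStrict("\u20AC");
        } catch (CharacterCodingException e) {
            System.out.println("Cannot encode: " + e);
        }
    }
}
```

The strict variant is the safer choice when losing a character silently would corrupt the data.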
Solution 2
A char is indeed 16 bits in Java (and is also the only unsigned primitive integer type).
If you are sure that your characters are all ASCII, you can simply cast each char to a byte (since ASCII uses only the lower 7 bits of the char).
If you do not need to modify the characters or understand their meaning within a String, you can just store each char in two bytes, like:
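The cast can be guarded so that a non-ASCII character is caught rather than silently truncated. This is only a sketch: the class AsciiCast and the helper toAsciiByte are hypothetical names, and throwing IllegalArgumentException is one possible policy among several.

```java
public class AsciiCast {

    // Hypothetical helper: casts a char to a byte, rejecting anything above 0x7F.
    public static byte toAsciiByte(char c) {
        if (c > 0x7F) {
            throw new IllegalArgumentException(
                    "Not an ASCII character: U+" + Integer.toHexString(c).toUpperCase());
        }
        return (byte) c; // safe: the value fits in the lower 7 bits
    }

    public static void main(String[] args) {
        System.out.println(toAsciiByte('A')); // prints 65
    }
}
```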
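char[] c = ...;
byte[] b = new byte[c.length * 2];
for (int i = 0; i < c.length; i++) {
    // Shift first, then cast: a cast binds more tightly than >>.
    b[2 * i] = (byte) ((c[i] & 0xFF00) >> 8); // high byte
    b[2 * i + 1] = (byte) (c[i] & 0x00FF);    // low byte
}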
(If speed matters, the multiplication by 2 could be written as a left shift by one, although the JIT compiler is likely to make that optimization itself.)
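For completeness, here is a sketch of the same two-bytes-per-char layout with the reverse conversion added. The class CharBytes and its method names are illustrative. The high-byte-first order used here matches UTF-16BE, so for well-formed text the result should equal what getBytes(StandardCharsets.UTF_16BE) produces.

```java
import java.util.Arrays;

public class CharBytes {

    // Stores each char as two bytes, high byte first (big-endian).
    public static byte[] toBytes(char[] chars) {
        byte[] bytes = new byte[chars.length * 2];
        for (int i = 0; i < chars.length; i++) {
            bytes[2 * i] = (byte) (chars[i] >> 8);
            bytes[2 * i + 1] = (byte) chars[i];
        }
        return bytes;
    }

    // Rebuilds the chars; & 0xFF undoes the sign extension of byte values.
    public static char[] toChars(byte[] bytes) {
        char[] chars = new char[bytes.length / 2];
        for (int i = 0; i < chars.length; i++) {
            chars[i] = (char) (((bytes[2 * i] & 0xFF) << 8) | (bytes[2 * i + 1] & 0xFF));
        }
        return chars;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(toBytes(new char[] {'A'}))); // prints [0, 65]
    }
}
```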
Note, however, that some displayed characters (more accurately, Unicode code points) are written as two consecutive chars, known as a surrogate pair. So splitting between two chars does not guarantee that you are splitting between actual characters.
If you need to decode, encode, or otherwise manipulate your char array in a String-aware manner, you should instead decode and encode your char array or String using the java.io tools, which ensure proper character handling.
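The point about code points spanning two chars can be illustrated with a short sketch. U+1F600 is just an example of a code point outside the Basic Multilingual Plane, and isSafeSplit is a hypothetical helper, not a standard API.

```java
public class SurrogateDemo {

    // Hypothetical helper: false when index falls inside a surrogate pair.
    public static boolean isSafeSplit(CharSequence s, int index) {
        if (index <= 0 || index >= s.length()) {
            return true;
        }
        return !(Character.isHighSurrogate(s.charAt(index - 1))
                && Character.isLowSurrogate(s.charAt(index)));
    }

    public static void main(String[] args) {
        String s = new String(Character.toChars(0x1F600)); // one code point
        System.out.println(s.length());                      // prints 2
        System.out.println(s.codePointCount(0, s.length())); // prints 1
        System.out.println(isSafeSplit(s, 1));               // prints false
    }
}
```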
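One way this might look with the java.io classes is sketched below, using OutputStreamWriter and InputStreamReader over in-memory streams. The class name StreamCodec and the choice of UTF-8 are assumptions for illustration; any charset supported by the platform could be substituted.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.CharArrayWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class StreamCodec {

    // Encodes chars to UTF-8 bytes through a Writer.
    public static byte[] encode(char[] chars) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            writer.write(chars);
        } catch (IOException e) {
            throw new UncheckedIOException(e); // not expected for in-memory streams
        }
        return out.toByteArray();
    }

    // Decodes UTF-8 bytes back to chars through a Reader.
    public static char[] decode(byte[] bytes) {
        CharArrayWriter result = new CharArrayWriter();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(bytes), StandardCharsets.UTF_8)) {
            char[] buffer = new char[1024];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                result.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result.toCharArray();
    }

    public static void main(String[] args) {
        // 'a' (1 byte) + euro sign (3 bytes) + U+1F600 (4 bytes)
        String text = "a\u20AC" + new String(Character.toChars(0x1F600));
        System.out.println(encode(text.toCharArray()).length); // prints 8
    }
}
```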
Solution 3
To extend what others are saying, if you have a char that you need as a byte array, then you first create a String containing that char and then get the byte array from the String:
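private byte[] charToBytes(final char x) {
    // Wrap the single char in a String so that String's encoding support can be used.
    String temp = String.valueOf(x);
    // Requires: import java.nio.charset.StandardCharsets;
    // The Charset overload cannot throw UnsupportedEncodingException, so no try/catch is needed.
    return temp.getBytes(StandardCharsets.ISO_8859_1);
}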
Of course, use the appropriate character set. Much more efficient than this would be to work with Strings from the start, rather than taking one char at a time, converting it to a String, and then converting that to a byte array.
jbu
Updated on July 28, 2022
I must convert a char into a byte or a byte array. In other languages I know that a char is just a single byte. However, looking at the Java Character class, its min value is \u0000 and its max value is \uFFFF. This makes it seem like a char is 2 bytes long.
Will I be able to store it as a byte or do I need to store it as two bytes?
Before anyone asks, I will say that I'm trying to do this because I'm working under an interface that expects my results to be a byte array. So I have to convert my char to one.
Please let me know and help me understand this.
Thanks, jbu