Char size 8 bit or 16 bit?

Solution 1

A char in Java is a UTF-16 code unit. It's not necessarily a complete Unicode character, but it's effectively an unsigned 16-bit integer.
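As a rough illustration of the difference between a code unit and a complete Unicode character, the sketch below compares String.length() (which counts char values) with codePointCount (which counts code points). The class and method names (CodeUnitDemo, codeUnits, codePoints) are made up for this example.

```java
public class CodeUnitDemo {
    // Number of UTF-16 code units (Java char values) in the text.
    static int codeUnits(String text) {
        return text.length();
    }

    // Number of Unicode code points (complete characters) in the text.
    static int codePoints(String text) {
        return text.codePointCount(0, text.length());
    }

    public static void main(String[] args) {
        String letter = "a";
        String emoji = "\uD83D\uDE00"; // U+1F600, stored as a surrogate pair

        System.out.println(codeUnits(letter) + " " + codePoints(letter)); // 1 1
        System.out.println(codeUnits(emoji) + " " + codePoints(emoji));   // 2 1

        // char is unsigned: the largest value widens to 65535, not -1.
        char max = '\uFFFF';
        System.out.println((int) max); // 65535
    }
}
```

The second string holds one Unicode character but two char values, which is what "not necessarily a complete Unicode character" means in practice.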

When you write text to a file (or in some other way convert it into a sequence of bytes), then the data will depend on which encoding you use. For example, if you use ASCII or ISO-8859-1 then you're very limited as to which characters you can write, but each character will only be a byte. If you use UTF-16, then each Java char will be converted into exactly two bytes - but some Unicode characters may take four bytes (those represented by two Java char values).
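To make the dependence on the encoding concrete, here is a small sketch that encodes the same text with several charsets and counts the resulting bytes. The class and helper names (EncodedSizes, encodedLength) are invented for illustration, and UTF_16BE is used instead of UTF_16 on the assumption that you want to count only the character data, since Java's plain UTF-16 encoder also writes a two-byte byte order mark.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodedSizes {
    // Number of bytes produced when the text is encoded with the given charset.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String letter = "a";
        String emoji = "\uD83D\uDE00"; // U+1F600, two Java char values

        System.out.println(encodedLength(letter, StandardCharsets.US_ASCII));   // 1
        System.out.println(encodedLength(letter, StandardCharsets.ISO_8859_1)); // 1
        System.out.println(encodedLength(letter, StandardCharsets.UTF_16BE));   // 2
        System.out.println(encodedLength(emoji, StandardCharsets.UTF_16BE));    // 4

        // Plain UTF-16 adds a 2-byte byte order mark in front of the data.
        System.out.println(encodedLength(letter, StandardCharsets.UTF_16));     // 4
    }
}
```

This also accounts for a text file containing only "a" being 1 byte long: the editor most likely saved it in a single-byte-per-ASCII-character encoding such as ASCII, ISO-8859-1 or UTF-8, which says nothing about the in-memory size of a Java char.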

If you use UTF-8, then the length of even a single Java char in the encoded form will depend on the value.
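A minimal sketch of that variable length, assuming you only pass non-surrogate char values (a lone surrogate cannot be encoded on its own and would be replaced by a substitute byte). The names Utf8Lengths and utf8Length are hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class Utf8Lengths {
    // Bytes needed to encode a single (non-surrogate) char in UTF-8.
    static int utf8Length(char c) {
        return String.valueOf(c).getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length('a'));      // 1 (U+0061)
        System.out.println(utf8Length('\u00E9')); // 2 (e with acute accent)
        System.out.println(utf8Length('\u20AC')); // 3 (euro sign)
    }
}
```

All three arguments occupy the same 16 bits as Java char values; only their UTF-8 encoded forms differ in length.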

Solution 2

Since Java 8 you can also ask the language directly: the constant Character.BYTES holds the number of bytes used to represent a char value.

System.out.println(Character.BYTES);

This prints 2.
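A slightly fuller sketch along the same lines: it prints the size in both bytes and bits and shows that the width does not depend on the value stored. The class name CharWidth and the helper wrap are made up for this example.

```java
public class CharWidth {
    // Narrow an int to char and widen it back, showing 16-bit wrap-around.
    static int wrap(int value) {
        return (int) (char) value;
    }

    public static void main(String[] args) {
        System.out.println(Character.BYTES); // 2
        System.out.println(Character.SIZE);  // 16

        // Both values fit in the same 16-bit type; neither is "one byte".
        char c1 = (char) 255;
        char c2 = (char) 258;
        System.out.println((int) c1); // 255
        System.out.println((int) c2); // 258

        // Anything above 65535 wraps around because only 16 bits are kept.
        System.out.println(wrap(65536)); // 0
    }
}
```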

Author: user3198603

Updated on August 07, 2020

Comments

  • user3198603
    user3198603 almost 4 years

    According to http://docs.oracle.com/javase/tutorial/java/nutsandbolts/datatypes.html, a char is 16 bits, i.e. 2 bytes, but I somehow recalled it being 8 bits, i.e. 1 byte. To clear up my doubt, I created a text file containing the single character "a" and saved it. When I inspected the file, its size was 1 byte, i.e. 8 bits. I am confused: what is the size of a character? If it is 2 bytes, why is the file size 1 byte, and if it is 1 byte, why does the link say 2 bytes?

  • Jon Skeet
    Jon Skeet almost 10 years
    What's your definition of "special"? Anything non-ASCII?
  • vogomatix
    vogomatix almost 10 years
    I was trying to keep my answer concise :-) for a full definition see Wikipedia
  • Jon Skeet
    Jon Skeet almost 10 years
    When "concise" means using such a hideously vague term as "special character", I don't think it's much use.
  • Pingpong
    Pingpong about 2 years
    Is c1 one byte, c2 two bytes? char c1 = (char) 255; char c2 = (char) 258;
  • Jon Skeet
    Jon Skeet about 2 years
    @Pingpong: No, char is a 16-bit data type whatever the value, just like int is a 32-bit data type whatever the value.