Java: char vs String byte size

Solution 1

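getBytes() encodes the String using the platform's default charset (often ISO-8859-1 or UTF-8; UTF-8 is the default since JDK 18), and in those encodings the letter "a" takes a single byte. A char, on the other hand, is always 2 bytes, because it holds one UTF-16 code unit. Conceptually a String is a sequence of these 2-byte chars (since Java 9 the JVM may store Latin-1-only strings more compactly internally, but that is not visible through the API). If you want to know more about encoding, read the link by Oded in the question comments.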
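
To make the charset dependence concrete, here is a minimal sketch that encodes the same one-character string with several explicitly named charsets. The class name EncodedSizeDemo and the helper encodedSize are made up for this illustration; the charsets come from java.nio.charset.StandardCharsets.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original question.
class EncodedSizeDemo {

    // Number of bytes produced when `text` is encoded with `charset`.
    static int encodedSize(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        // The charset that the no-argument getBytes() would use on this JVM.
        System.out.println("Default charset: " + Charset.defaultCharset());

        // A char is a fixed 16-bit value, regardless of any charset.
        System.out.println("char size in bytes: " + Character.SIZE / 8);

        System.out.println("ISO-8859-1: " + encodedSize("a", StandardCharsets.ISO_8859_1));
        System.out.println("UTF-8:      " + encodedSize("a", StandardCharsets.UTF_8));
        System.out.println("UTF-16BE:   " + encodedSize("a", StandardCharsets.UTF_16BE));
        // UTF_16 prepends a 2-byte byte-order mark when encoding.
        System.out.println("UTF-16:     " + encodedSize("a", StandardCharsets.UTF_16));
    }
}
```

Passing the charset explicitly, rather than relying on the no-argument getBytes(), keeps the byte count the same from one machine to the next.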

Solution 2

Here is what I think; correct me if I am wrong. You are measuring a length, and it is correctly reported as 1 because the string contains only one character. length gives the length, not the size. Length and size are two different things.

Check this link. You are measuring the number of bytes occupied in the wrong way.
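
As a rough illustration of how length and size can diverge, the sketch below compares String.length() (the number of chars), the code point count, and the UTF-8 encoded byte count for two non-ASCII strings. The class name LengthVsSize and the helper utf8Size are invented for this example, and none of these figures is the in-memory footprint of the String object itself.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original answer.
class LengthVsSize {

    // Bytes needed to encode `s` as UTF-8 (not the heap size of the String).
    static int utf8Size(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of Unicode code points, which can be smaller than length().
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String euro = "\u20AC";         // EURO SIGN, one char
        String emoji = "\uD83D\uDE00";  // U+1F600, stored as a surrogate pair

        System.out.println("euro:  length=" + euro.length()
                + " codePoints=" + codePoints(euro)
                + " utf8Bytes=" + utf8Size(euro));
        System.out.println("emoji: length=" + emoji.length()
                + " codePoints=" + codePoints(emoji)
                + " utf8Bytes=" + utf8Size(emoji));
    }
}
```

The strings are written with Unicode escapes so the result does not depend on the encoding of the source file.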

Author by

jayunit100

Updated on January 09, 2021

Comments

  • jayunit100 over 3 years

    I was surprised to find that the following code

    System.out.println("Character size:"+Character.SIZE/8);
    System.out.println("String size:"+"a".getBytes().length);
    

    outputs this:

    Character size:2

    String size:1

    I would assume that a single-character string should take up the same number of bytes as a single char, or more.

    In particular, I am wondering: if I have a Java bean with several fields in it, how will its size increase depending on the nature of the fields (Character, String, Boolean, Vector, etc.)? I'm assuming that all Java objects have some (probably minimal) footprint, and that one of the smallest of these footprints would be a single character. To test that basic assumption I started with the above code, and the results of the print statements seem counterintuitive.

    Any insights into the way Java stores/serializes characters vs. strings by default would be very helpful.