In C, why is sizeof(char) 1, when 'a' is an int?


Solution 1

In C, 'a' is an integer constant (!?!), so 4 is correct for your architecture. It is implicitly converted to char for the assignment. sizeof(char) is always 1 by definition. The standard doesn't say what units 1 is, but it is often bytes.
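A minimal sketch of all three points (assuming a platform where int is 4 bytes, and a C99-or-later compiler for the %zu format):

  #include <stdio.h>

  int main(void)
  {
      char ch = 'a';                 /* 'a' is an int; its value is converted to char on assignment */
      printf("%zu\n", sizeof(char)); /* always 1 */
      printf("%zu\n", sizeof('a'));  /* same as sizeof(int), e.g. 4 */
      printf("%c\n", ch);            /* the value is preserved: prints a */
      return 0;
  }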

Solution 2

The C standard says that a character literal like 'a' is of type int, not type char. It therefore has (on your platform) sizeof == 4. See this question for a fuller discussion.
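If your compiler supports C11, _Generic can confirm the literal's type directly. A minimal sketch (the TYPE_NAME macro here is purely illustrative, not a standard facility):

  #include <stdio.h>

  #define TYPE_NAME(x) _Generic((x), char: "char", int: "int", default: "other")

  int main(void)
  {
      printf("'a' has type %s\n", TYPE_NAME('a')); /* prints "int" in C */
      char c = 'a';
      printf("c has type %s\n", TYPE_NAME(c));     /* prints "char" */
      return 0;
  }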

Solution 3

It is the normal behavior of the sizeof operator (see Wikipedia):

  • For a datatype, sizeof returns the size of the datatype. For char, you get 1.
  • For an expression, sizeof returns the size of the type of the variable or expression. As a character literal is typed as int, you get 4 (see the sketch below).
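A short sketch of the two forms (sizes other than sizeof(char) are platform-dependent; a typical platform gives 4 for int):

  #include <stdio.h>

  int main(void)
  {
      double d = 0.0;
      printf("%zu\n", sizeof(char)); /* type name: 1 by definition */
      printf("%zu\n", sizeof(int));  /* type name: e.g. 4 */
      printf("%zu\n", sizeof d);     /* expression: size of its type, double */
      printf("%zu\n", sizeof 'a');   /* expression: 'a' is an int in C, e.g. 4 */
      return 0;
  }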

Solution 4

This is covered in ISO C11 6.4.4.4 Character constants, though it's largely unchanged from earlier standards. It states, in paragraph 10:

An integer character constant has type int. The value of an integer character constant containing a single character that maps to a single-byte execution character is the numerical value of the representation of the mapped character interpreted as an integer.
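For example, on an implementation whose execution character set is ASCII (an assumption; the standard doesn't require ASCII), 'a' maps to 97 and can be printed as an int without any cast:

  #include <stdio.h>

  int main(void)
  {
      printf("%d\n", 'a'); /* prints 97 on an ASCII system; 'a' already is an int */
      return 0;
  }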


Comments

  • legends2k
    legends2k over 3 years

    I tried

    printf("%d, %d\n", sizeof(char), sizeof('c'));

    and got 1, 4 as output. If size of a character is one, why does 'c' give me 4? I guess it's because it's an integer. So when I do char ch = 'c'; is there an implicit conversion happening, under the hood, from that 4 byte value to a 1 byte value when it's assigned to the char variable?

    • Rabeel
      Rabeel about 14 years
      I believe it's to do with automatic integer promotion; someone with more facts than belief will post a factual answer.
    • legends2k
      legends2k about 14 years
      @Roger: He is asking about the difference between C and C++ sizeof('a'), while I asked whether a conversion is happening; see the question body. I've already deduced that 'a' is an integer in C.
    • Rabeel
      Rabeel about 14 years
      I have to thank "David Rodríguez - dribeas" for pointing out the link in my answer is incorrect. I'm deleting my answer. legends2k, the correct answer should go to Peter or Neil, in my humble opinion.
    • legends2k
      legends2k about 14 years
      I've changed the accepted answer now. Thanks for the correction, David Rodríguez - dribeas.
    • Alok Singhal
      Alok Singhal about 14 years
      You got your answer, but a comment: you can't print size_t objects with "%d". Since sizeof yields a size_t object, you should print it with "%zu" (C99) or cast it to unsigned long and print with "%lu" (C89). See the first sketch after this thread.
  • Rabeel
    Rabeel about 14 years
    +1 for "but it is often bytes"; I'm still chuckling :)
  • SmacL
    SmacL about 14 years
    The integer format referred to sizeof('a'), not 'a', so I don't see how this argument holds.
  • lexu
    lexu about 14 years
    An integer used to be 2 bytes ... the standard doesn't define that either.
  • Admin
    Admin about 14 years
    The C standard says a char literal is of type int - it has the size of an int, and no promotion is involved.
  • heijp06
    heijp06 about 14 years
    Your answer seems to suggest that the C compiler inspects a format string used by a library function when compiling a program. Are you sure that is the case?
  • SF.
    SF. about 14 years
    What if it was scanf("%s\n", format); printf(format, sizeof(char), sizeof('a')); and you'd type "%d, %d\n" when prompted? In this case the compiler has no way of knowing the variable types a priori and has to use the ellipsis blindly, as it is meant to.
  • legends2k
    legends2k about 14 years
    May I know the rationale behind the standard stating that sizeof(char) should always be 1? Is it because of the ASCII table having 256 chars? What if an implementation needs more than that, say Unicode?
  • legends2k
    legends2k about 14 years
    I asked about the promotion/casting that happens between the two data types, but the discussion/answer doesn't address this.
  • Admin
    Admin about 14 years
    @legends2K You asked "If size of a character is one, why does 'c' give me 4?" As this answer and the question I linked explain, 'a' has sizeof == 4, so there is obviously no casting or promotion taking place.
  • legends2k
    legends2k about 14 years
    Well, there is a detailed form of the question below it, which reads "is there an implicit typecasting happening, under the hood, from that 4 byte value to a 1 byte value when it's assigned to the char variable?" This too is part of it, I believe.
  • Michael Speer
    Michael Speer about 14 years
    @Peter van der Heijden: you are correct; a format string and its specifiers have nothing to do with the types of the variables passed after them. gcc will issue warnings if they don't line up, but it compiles with mismatched types just fine, under the assumption that you know more than the compiler does. That said, the 'a' is in a sizeof and is not in an "integer context". The sizeof calls return size_t, which I believe is generally typedef'ed to an unsigned integer.
  • Richard Pennington
    Richard Pennington about 14 years
    sizeof(char) is always 1 because that's what it is. That 1 can be one byte (of 8 bits, for example) or 3 bytes or ...
  • David Rodríguez - dribeas
    David Rodríguez - dribeas about 14 years
    The standard defines the sizeof operator as returning the size in bytes, so it is not often, but rather always. In the second paragraph of 'The sizeof operator': 'The sizeof operator yields the size (in bytes) of its operand.'
  • AProgrammer
    AProgrammer about 14 years
    sizeof(char) is one byte because that is the definition of byte in the C standard. That byte may be 8 bits or more (it can't be fewer in C), and may or may not be the smallest unit addressable by the computer (the definition of byte common in computer architecture). A third common definition of byte is "the unit used for a character encoding" - i.e. 8 bits for UTF-8 or ISO-8859-X, 16 bits for UTF-16. Quite often, all the definitions agree and put the size of the byte at 8 bits. So often that a fourth definition of byte is "8 bits". When they don't agree, you had better be clear about which definition you use.
  • Johannes Schaub - litb
    Johannes Schaub - litb about 14 years
    I always shudder when reading "implicitly cast" in SO posts. There is no implicit cast: A cast is always an explicit conversion. The C Standard says in 6.3: "Several operators convert operand values from one type to another automatically. This subclause specifies the result required from such an implicit conversion, as well as those that result from a cast operation (an explicit conversion).". You want to say "implicitly converted".
  • David Thornley
    David Thornley about 14 years
    @lexu: An int has to be at least 16 bits, whatever that comes to in bytes. Since sizeof() measures in 8-bit bytes on most modern computers, that typically means at least 2 bytes. An int is supposed to be a natural size, which means 2 bytes on the old 16-bit machines and 4 on the more modern ones.
  • legends2k
    legends2k about 14 years
    @litb: Thanks for correcting me; I've fixed it in the question from 'casting' to 'conversion'.
  • Vatine
    Vatine about 14 years
    sizeof() measures in (integer, I believe) multiples of CHAR_BIT. No more, no less. sizeof(char) == 1, by definition. The number of bits in another type can be found by multiplying sizeof(type) with CHAR_BIT. Of course, most (if not all) platforms will have CHAR_BIT being 8. See the second sketch after this thread.
  • gnasher729
    gnasher729 about 10 years
    There is no promotion. In C, 'a' has type int. In most C implementations, 'a' is exactly the same as 97. In C++, 'a' has type char.
  • legends2k
    legends2k about 10 years
    +1, thanks for quoting the standard; I wonder why "integer character constant" was chosen over "character constant".
  • Lightness Races in Orbit
    Lightness Races in Orbit almost 9 years
    It's always bytes. It may not be octets.
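Following up on Alok Singhal's comment above, here is a corrected version of the question's snippet; a minimal sketch assuming a C99-or-later compiler for %zu, with the C89 fallback shown as well:

  #include <stdio.h>

  int main(void)
  {
      /* sizeof yields a size_t, so %zu is the matching format (C99) */
      printf("%zu, %zu\n", sizeof(char), sizeof('c'));
      /* C89 fallback: cast to unsigned long and print with %lu */
      printf("%lu, %lu\n", (unsigned long)sizeof(char), (unsigned long)sizeof('c'));
      return 0;
  }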
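And following up on AProgrammer's and Vatine's comments: the macro is spelled CHAR_BIT and lives in <limits.h>. A minimal sketch of computing a type's width in bits from it:

  #include <stdio.h>
  #include <limits.h>

  int main(void)
  {
      printf("bits per byte: %d\n", CHAR_BIT);                 /* at least 8 */
      printf("bits in an int: %zu\n", sizeof(int) * CHAR_BIT); /* e.g. 32 */
      return 0;
  }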