Does cast between signed and unsigned int maintain exact bit pattern of variable in memory?
Solution 1
In general, casting in C is specified in terms of values, not bit patterns - the former will be preserved (if possible), but the latter not necessarily so. In case of two's complement representations without padding - which is mandatory for the fixed-width integer types - this distinction does not matter and the cast will indeed be a no-op.
But even if the conversion from signed to unsigned would have changed the bit pattern, converting it back again would have restored the original value - with the caveat that out-of-range unsigned to signed conversion is implementation-defined and may raise a signal on overflow.
For full portability (which will probably be overkill), you'll need to use type punning instead of conversion. This can be done in one of two ways:
Via pointer casts, i.e.
uint32_t u = *(uint32_t*)&x;
which you should be careful with as it may violate effective typing rules (but is fine for signed/unsigned variants of integer types), or via unions, i.e.
uint32_t u = ((union { int32_t i; uint32_t u; }){ .i = x }).u;
which can also be used to e.g. convert from double to uint64_t, which you may not do with pointer casts if you want to avoid undefined behaviour.
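As a rough sketch of the union approach with a temporary variable instead of a compound literal: the function names below are made up for illustration and are not from the answer, and the double case assumes that double and uint64_t have the same size.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the bytes of a signed value as unsigned via a union. */
uint32_t pun_i32_to_u32(int32_t x)
{
    union { int32_t i; uint32_t u; } pun;
    pun.i = x;
    return pun.u;
}

/* Reverse direction: reinterpret unsigned bytes as signed. */
int32_t pun_u32_to_i32(uint32_t u)
{
    union { int32_t i; uint32_t u; } pun;
    pun.u = u;
    return pun.i;
}

/* Same idea for double -> uint64_t (assumes both are 8 bytes wide). */
uint64_t pun_double_to_u64(double d)
{
    union { double d; uint64_t u; } pun;
    assert(sizeof(double) == sizeof(uint64_t));
    pun.d = d;
    return pun.u;
}

/* Hypothetical self-check: punning keeps the bytes and round-trips. */
void demo_punning(void)
{
    int32_t x = -2;
    uint32_t u = pun_i32_to_u32(x);
    assert(memcmp(&x, &u, sizeof x) == 0);  /* identical bytes */
    assert(u == (uint32_t)x);               /* same result as the cast */
    assert(pun_u32_to_i32(u) == x);         /* round trip */
}
```

Calling demo_punning() from main should pass silently on any platform that provides the exact-width types.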
Solution 2
Casts are used in C to mean both "type conversion" and "type disambiguation". If you have something like
(float) 3
Then it's a type conversion, and the actual bits change. If you say
(float) 3.0
it's a type disambiguation.
Assuming a 2's complement representation (see comments below), when you cast an int to unsigned int, the bit pattern is not changed, only its semantic meaning; if you cast it back, the result will always be correct. It falls into the case of type disambiguation because no bits are changed, only the way that the computer interprets them.
Note that, in theory, 2's complement may not be used, and unsigned and signed can have very different representations, and the actual bit pattern can change in that case.
However, from C11 (the current C standard when this was written), you actually are guaranteed that sizeof(int) == sizeof(unsigned int):
(§6.2.5/6) For each of the signed integer types, there is a corresponding (but different) unsigned integer type (designated with the keyword unsigned) that uses the same amount of storage (including sign information) and has the same alignment requirements [...]
I would say that in practice, you can assume it is safe.
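If you would rather check than assume, something like the following could be run once on the target platform. It is only a sketch: cast_keeps_bytes and demo_cast_keeps_bytes are names invented here, and a passing check says something about that one platform only, not about what the standard guarantees.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

/* Returns 1 if converting x to unsigned int left its bytes unchanged. */
int cast_keeps_bytes(int x)
{
    unsigned int u = (unsigned int)x;
    return memcmp(&x, &u, sizeof x) == 0;
}

/* Hypothetical self-check over a few interesting values. */
void demo_cast_keeps_bytes(void)
{
    assert(sizeof(int) == sizeof(unsigned int)); /* the 6.2.5/6 guarantee */
    assert(cast_keeps_bytes(0));
    assert(cast_keeps_bytes(-1));
    assert(cast_keeps_bytes(INT_MAX));
    assert(cast_keeps_bytes(INT_MIN)); /* holds on 2's complement without padding */
}
```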
Solution 3
This should always be safe, because the intXX_t types are guaranteed to be in two's complement if they exist:
7.20.1.1 Exact-width integer types: The typedef name intN_t designates a signed integer type with width N, no padding bits, and a two’s complement representation. Thus, int8_t denotes such a signed integer type with a width of exactly 8 bits.
Theoretically, the back-conversion from uint32_t to int32_t is implementation-defined, as for all unsigned to signed conversions where the value does not fit. But I can't much imagine that a platform would do differently than what you expect.
If you want to be really sure of this, you could still do that conversion manually. You'd just have to test a value for > INT32_MAX and then do a little bit of math. Even if you do that systematically, a decent compiler should be able to detect that and optimize it out.
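One possible way to write that manual conversion is sketched below. The function name u32_to_i32 is made up for this example, and this is just one arrangement of the "little bit of math" that keeps every intermediate value in range.

```c
#include <assert.h>
#include <stdint.h>

/* Convert uint32_t to int32_t without relying on the
   implementation-defined out-of-range conversion. */
int32_t u32_to_i32(uint32_t u)
{
    if (u <= (uint32_t)INT32_MAX)
        return (int32_t)u;  /* value fits, so it is preserved */

    /* u is in [2^31, 2^32 - 1]. Subtracting 2^31 in unsigned arithmetic
       gives a value in [0, 2^31 - 1], which fits in int32_t; adding
       INT32_MIN then yields the intended negative number. */
    return (int32_t)(u - (uint32_t)INT32_MAX - 1u) + INT32_MIN;
}

/* Hypothetical self-check covering the boundary values. */
void demo_u32_to_i32(void)
{
    assert(u32_to_i32(0u) == 0);
    assert(u32_to_i32(UINT32_C(0x7FFFFFFF)) == INT32_MAX);
    assert(u32_to_i32(UINT32_C(0x80000000)) == INT32_MIN);
    assert(u32_to_i32(UINT32_C(0xFFFFFFFF)) == -1);
}
```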
Flash
Updated on October 22, 2020
Comments
-
Flash over 3 years
I want to pass a 32-bit signed integer x through a socket. In order that the receiver knows which byte order to expect, I am calling htonl(x) before sending. htonl expects a uint32_t though and I want to be sure of what happens when I cast my int32_t to a uint32_t.
int32_t x = something; uint32_t u = (uint32_t) x;
Is it always the case that the bytes in x and u each will be exactly the same? What about casting back:
uint32_t u = something; int32_t x = (int32_t) u;
I realise that negative values cast to large unsigned values but that doesn't matter since I'm just casting back on the other end. However if the cast messes with the actual bytes then I can't be sure casting back will return the same value.
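For reference, the round trip being asked about might look roughly like this. It is only a sketch: encode_i32 and decode_i32 are hypothetical helper names, the actual socket send/receive is left out, and htonl/ntohl are assumed to come from the POSIX header arpa/inet.h.

```c
#include <arpa/inet.h>  /* htonl, ntohl (POSIX) */
#include <assert.h>
#include <stdint.h>

/* Sender side: convert to unsigned, then to network byte order. */
uint32_t encode_i32(int32_t x)
{
    return htonl((uint32_t)x);
}

/* Receiver side: back to host byte order, then to signed.
   The last conversion is implementation-defined for values above
   INT32_MAX; this sketch assumes the usual wrap-around behaviour. */
int32_t decode_i32(uint32_t wire)
{
    return (int32_t)ntohl(wire);
}

/* Hypothetical self-check: negative values survive the round trip. */
void demo_round_trip(void)
{
    assert(decode_i32(encode_i32(-123456)) == -123456);
    assert(decode_i32(encode_i32(-1)) == -1);
    assert(decode_i32(encode_i32(INT32_MAX)) == INT32_MAX);
}
```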
-
Christoph over 10 years
this is incorrect - signed to unsigned conversion may change bit patterns - it just so happens that it does not in case of two's complement representation with identical padding
-
Flash over 10 years
Great, this is what I wanted to know. So it will always work provided signed ints are represented using two's complement and this is pretty much always the case. Is that correct?
-
unwind over 10 years
The initial example is wrong, since 3.0 is a double, the expression (float) 3.0 most certainly is a type conversion, too.
-
Christoph over 10 years
@Andrew: correct - and even on non-two's complement hardware, the compiler would have to either fake it for the fixed-width integer types or not provide them at all; raising a signal on overflow should also not be a problem in practice
-
Filipe Gonçalves over 10 years
@unwind According to Expert C Programming: Deep C Secrets (page 223), it's a type disambiguation because the compiler can plant the correct bits in the first place. It's like 3.0f
-
Christoph over 10 years
well, I can imagine that a platform that traps integer overflow might be convinced to do so on type conversion (eg the compiler might generate a useless add 0 instruction to trigger it); I'd be very surprised if there's a compiler that actually does so by default (or at all); also, I'd rather go for type punning than checks for INT32_MAX - the fixed-width integers do not come with trap representations, so as far as the C standard is concerned, it's as safe as it gets and actually captures the programmer's intent
-
Filipe Gonçalves over 10 years
Had I used double x = 3.0; float y; y = (float) x; then it would certainly be a type conversion
-
Zulan about 8 years
Does the pointer casting example not violate the strict aliasing rule? int32_t and uint32_t are incompatible. Are they not?
-
Christoph about 8 years
@Zulan: signed and unsigned versions of a type may alias (cf C11 section 6.5 §7)
-
Cecil Ward over 7 years
Which dialect of C or C++, or which standard, is required to be able to use the syntax used in the union example?
-
Christoph over 7 years
@CecilWard: C99 due to the use of compound literals; you could use a temporary variable instead to get C90 and C++ compatibility
-
Eric over 6 years
That book doesn't seem correct to me, or at least its distinction is not meaningful. The compiler can plant the correct bits for (float) 3 as well, as it performs constant folding.
-
polynomial_donut about 6 years
"which you should be careful with as it may violate effective typing rules" can you elaborate on this?