C++ difference between unsigned int and unsigned long int


Solution 1

The short answer is yes: sizeof(unsigned) is not guaranteed to be equal to sizeof(unsigned long), although the two happen to be equal in MSVC. If you need an integer of a known size on every platform, use the fixed-width types in <cstdint>, such as uint32_t and uint64_t. Avoid long in portable code; it can lead to a lot of headaches.
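As a minimal sketch of that advice, the compile-time checks below should pass on any conforming implementation that provides the fixed-width types. The helper name int_and_long_same_size is hypothetical and exists only for illustration.

```cpp
#include <climits>
#include <cstdint>

// The fixed-width types from <cstdint> have exactly the stated number of bits
// wherever they are provided.
static_assert(sizeof(std::uint32_t) * CHAR_BIT == 32, "uint32_t is 32 bits");
static_assert(sizeof(std::uint64_t) * CHAR_BIT == 64, "uint64_t is 64 bits");

// For the built-in types the standard guarantees an ordering, not equality.
static_assert(sizeof(unsigned) <= sizeof(unsigned long),
              "unsigned long is at least as large as unsigned");

// Hypothetical helper: reports whether the two built-in types happen to have
// the same size on the platform being compiled for (true with MSVC, but it
// may be false elsewhere).
constexpr bool int_and_long_same_size() {
    return sizeof(unsigned) == sizeof(unsigned long);
}
```

Code that stores DSP samples in std::uint32_t or std::uint64_t never needs to call a helper like this, which is the point of the recommendation.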

Also note that overflowing an unsigned integer is not undefined behaviour. The defined behaviour is that the value wraps around (e.g. std::numeric_limits<uint64_t>::max() + 43 == 42). That's not true for signed types, where overflow is undefined behaviour.
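The wrap-around can be checked at compile time. This is a small sketch; the names wrapping_add and kMax are made up for the example.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: plain unsigned addition, which is defined to wrap
// modulo 2^64 for std::uint64_t.
constexpr std::uint64_t wrapping_add(std::uint64_t a, std::uint64_t b) {
    return a + b;
}

constexpr std::uint64_t kMax = std::numeric_limits<std::uint64_t>::max();

// Adding 1 to the maximum wraps to 0, so adding 43 lands on 42.
static_assert(wrapping_add(kMax, 1) == 0, "max + 1 wraps to 0");
static_assert(wrapping_add(kMax, 43) == 42, "max + 43 wraps to 42");
```

The same expressions with a signed type would not be valid constant expressions, because signed overflow is undefined behaviour.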

To answer your final question, a 64-bit integer can store values in the range [0, 2^64 - 1]. 2^32 + 1 will not cause a wrap around.
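To make the arithmetic concrete, here is a short sketch using std::uint64_t, which is assumed here as the 64-bit type. The constant name kTwoPow32PlusOne is invented for the example.

```cpp
#include <cstdint>
#include <limits>

// 2^32 + 1, computed in a 64-bit unsigned type.
constexpr std::uint64_t kTwoPow32PlusOne = (std::uint64_t{1} << 32) + 1;

// The value is stored exactly; nothing wraps.
static_assert(kTwoPow32PlusOne == 4294967297ULL, "2^32 + 1 is representable");

// It is far below the 64-bit ceiling of 2^64 - 1.
static_assert(kTwoPow32PlusOne < std::numeric_limits<std::uint64_t>::max(),
              "well inside the 64-bit range");
```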

Solution 2

In the C++ standard, unsigned int is only guaranteed to be able to represent values from 0 to 65535. Practically, this corresponds to 16 bits. Implementations (i.e. compilers) may provide an unsigned int with a larger range, but are not required to.

In comparison, unsigned long int is guaranteed to be able to represent values in the range 0 to 4294967295. Practically, this corresponds to 32 bits. Again, an implementation may support a larger range.

The ranges and sizes of these types are implementation defined. The implementation must meet the minimum requirements, and document which it chooses.
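A sketch of how to ask the implementation what it actually chose, rather than assuming. The template name value_bits is hypothetical; std::numeric_limits and the <climits> macros are standard.

```cpp
#include <climits>
#include <limits>

// The minimum ranges required by the standard hold on every implementation.
static_assert(UINT_MAX >= 65535u, "unsigned int covers at least 0 to 65535");
static_assert(ULONG_MAX >= 4294967295ul,
              "unsigned long covers at least 0 to 4294967295");

// Hypothetical helper: number of value bits the implementation gives type T.
// For unsigned types, numeric_limits<T>::digits is the count of value bits.
template <typename T>
constexpr int value_bits() {
    return std::numeric_limits<T>::digits;
}
```

Calling value_bits<unsigned long>() gives the width chosen by the compiler and target in use, which is exactly the value that differs between implementations.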

It is not uncommon for implementations to provide unsigned int and unsigned long int that are both 32-bit, or for one or both to be 64-bit. As you've noted, the documentation for Visual Studio 2015 states that both unsigned int and unsigned long int represent the values 0 to 4294967295.

For a compiler like g++, which can target a range of systems, the choices are often determined by the target system. So a build of g++ for a 32-bit system may support different ranges for unsigned long than a build for a 64-bit system.

To answer your questions:

For unsigned long 64 bits, is going beyond ceiling 4,294,967,295 an undefined behavior or correct behavior?

It has well defined behaviour although it may not produce the results you expect. Unsigned integral types in C++ use modulo arithmetic. An integral value (of a type that can support a larger range) that is outside the range 0 to 4294967295 will be converted so it is in that range (mathematically equivalent to repeatedly adding or subtracting 4294967296 [note the last digit]). In other words, it will "wrap around".
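The conversion rule can be sketched with std::uint32_t, which is used here so that the target range is 0 to 4294967295 on every platform. The function name to_u32 is made up for the example.

```cpp
#include <cstdint>

// Hypothetical helper: converts a wider signed value to a 32-bit unsigned
// type. The result is the input reduced modulo 4294967296 (2^32).
constexpr std::uint32_t to_u32(std::int64_t value) {
    return static_cast<std::uint32_t>(value);
}

// One past the ceiling wraps to 0, two past wraps to 1.
static_assert(to_u32(4294967296) == 0u, "2^32 wraps to 0");
static_assert(to_u32(4294967297) == 1u, "2^32 + 1 wraps to 1");

// Negative values wrap upwards: -1 becomes the ceiling itself.
static_assert(to_u32(-1) == 4294967295u, "-1 wraps to 2^32 - 1");
```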

If I have an application working on Windows system compiled in Visual Studio, basically unsigned == unsigned long. True or False?

Assuming Visual Studio 2015 it is true, as the documentation you have linked to says. It may or may not be true for future Microsoft products - that is an implementation decision.

If I have an application compiled by GNU compiler working on Linux/Windows I have to make sure whether unsigned long == unsigned int or unsigned long == unsigned long long to avoid data overflow. True or False

Actually, false.

It is only true if you are porting code that relies on those assumptions being true.

There are techniques you can use so your code does not rely on such assumptions being true. For example, you can safely detect when an operation involving two unsigned values overflows or underflows, and take actions to produce required results anyway. If all operations are checked appropriately, you can avoid reliance on particular sizes of types.
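One such technique, as a rough sketch: test whether an operation would wrap before performing it, using only the limits the implementation reports. The function names below are hypothetical, and unsigned long is just one possible choice of type.

```cpp
#include <limits>

// Hypothetical helper: true if a + b would exceed the maximum and wrap.
constexpr bool add_overflows(unsigned long a, unsigned long b) {
    return b > std::numeric_limits<unsigned long>::max() - a;
}

// Hypothetical helper: true if a - b would go below zero and wrap.
constexpr bool sub_underflows(unsigned long a, unsigned long b) {
    return b > a;
}

// Hypothetical helper: true if a * b would exceed the maximum and wrap.
constexpr bool mul_overflows(unsigned long a, unsigned long b) {
    return a != 0 && b > std::numeric_limits<unsigned long>::max() / a;
}
```

None of these checks mentions a bit width, so they behave the same whether unsigned long is 32 or 64 bits. The caller decides what to do when a check fires, for example saturating or switching to a wider type.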

If I have a cross platform application that might be compiled by all of these Visual Studio/GNU/Clang/Intel compiler I have to clearly classify the environment with bunch of preprocessors to avoid data overflow. True or False

Not strictly true. Practically, often true.

If your code sticks to the realms of standard C++, there are techniques to avoid doing that (like I alluded to above, with being able to avoid reliance on sizes of unsigned types).

If your code uses, or provides wrappers for, functions that are outside standard C++ (e.g. the Windows API, or POSIX functions that are not in the C++ standard) then it will often be necessary. Even in those cases, it is possible to avoid such things. For example, place the versions of a function that use the Windows API in a different source file from the versions that use POSIX, and configure the build process (makefile, etc.) so that, when building for Windows, the Unix versions are not compiled or linked in.

Author by

yc2986

Updated on July 12, 2022

Comments

  • yc2986
    yc2986 almost 2 years

    I am doing C++ development on Windows with Visual Studio Compiler, specifically Visual Studio 2015 Update 3.

    For some DSP-related work I am using the unsigned int/unsigned long data types. I was wondering what the difference is between these two built-in C/C++ types.

    I searched through Google and SO for a little bit and found these references.

    1. Types documentation on cppreference.com
    2. Types documentation on MSDN for Visual Studio 2015
    3. Types documentation for GNU C/C++ (as G++ compiler stated that C/C++ use same default type implementation, I refer to C doc here)

    I assume the cppreference documentation is a summary of the ISO C++11 standard. So according to the "standard", unsigned and unsigned int are 16/32 bits depending on the LP/ILP 32/64 data model, while unsigned long and unsigned long int are 32/64 bits depending on the LP/ILP 32/64 data model.

    The MSDN and GNU documentation both state that unsigned int/unsigned long use a 32-bit implementation and can hold values up to 4,294,967,295. However, the GNU documentation also states that, depending on your system, unsigned long could be 64 bits, in which case it is the same as unsigned long long int.

    So my questions are as follows:

    1. For unsigned long 64 bits, is going beyond ceiling 4,294,967,295 an undefined behavior or correct behavior?
    2. If I have an application working on Windows system compiled in Visual Studio, basically unsigned == unsigned long. True or False?
    3. If I have an application compiled by GNU compiler working on Linux/Windows I have to make sure whether unsigned long == unsigned int or unsigned long == unsigned long long to avoid data overflow. True or False
    4. If I have a cross platform application that might be compiled by all of these Visual Studio/GNU/Clang/Intel compiler I have to clearly classify the environment with bunch of preprocessors to avoid data overflow. True or False

    Thanks in advance.

    Edit: Thanks to @PeterRuderman for pointing out that going beyond the ceiling value for an unsigned type is not undefined behavior.

    Then my question 1 will change to:

    1. For unsigned long 64 bits, will going beyond ceiling 4,294,967,295 cause itself to wrap around?