Why is (18446744073709551615 == -1) true?

c++ equality unsigned signed integer-overflow

29,715

Solution 1

`18,446,744,073,709,551,615`

This number mentioned, 18,446,744,073,709,551,615, is actually 2^64 − 1. The important thing here is that 2^64-1 is essentially 0-based 2^64. The first digit of an unsigned integer is 0, not 1. So if the maximum value is 1, it has two possible values: 0, or 1 (2).

Let's look at 2^64 - 1 in 64bit binary, all the bits are on.

1111111111111111111111111111111111111111111111111111111111111111b

The `-1`

Let's look at +1 in 64bit binary.

0000000000000000000000000000000000000000000000000000000000000001b

To make it negative in One's Complement (OCP) we invert the bits.

1111111111111111111111111111111111111111111111111111111111111110b

Computers seldom use OCP, they use Two's Complement (TCP). To get TCP, you add one to OCP.

1111111111111111111111111111111111111111111111111111111111111110b (-1 in OCP)
+                                                              1b (1)
-----------------------------------------------------------------
1111111111111111111111111111111111111111111111111111111111111111b (-1 in TCP)

"But, wait" you ask, if in Twos Complement -1 is,

1111111111111111111111111111111111111111111111111111111111111111b

And, if in binary 2^64 - 1 is

1111111111111111111111111111111111111111111111111111111111111111b

Then they're equal! And, that's what you're seeing. You're comparing a signed 64 bit integer to an unsigned 64bit integer. In C++ that means convert the signed value to unsigned, which the compiler does.

Update

For a technical correction thanks to davmac in the comments, the conversion from -1 which is signed to an unsigned type of the same size is actually specified in the language, and not a function of the architecture. That all said, you may find the answer above useful for understanding the arch/languages that support two's compliment but lack the spec to ensure results you can depend on.

Solution 2

string::npos is defined as constexpr static std::string::size_type string::npos = -1; (or if it's defined inside the class definition that would be constexpr static size_type npos = -1; but that's really irrelevant).

The wraparound of negative numbers converted to unsigned types (std::string::size_type is basically std::size_t, which is unsigned) is perfectly well-defined by the Standard. -1 wraps to the largest representable value of the unsigned type, which in your case is 18446744073709551615. Note that the exact value is implementation-defined because the size of std::size_t is implementation-defined (but capable of holding the size of the largest possible array on the system in question).

Solution 3

According to the C++ Standard (Document Number: N3337 or Document Number: N4296) std::string::npos is defined the following way

static const size_type npos = -1;

where std::string::size_type is some unsigned integer type. So there is nothing wonderful that std::string::npos is equal to -1. The initializer is converted to the tyhpe of std::string::npos.

As for this equation

(string::npos == ULONG_MAX) is true,

then it means that the type std::string::npos has type in the used implementation unsigned long. This type is usually corresponds to the type size_t.

In this equation

(18446744073709551615 == -1)

The left literal has some unsigned integral type that is appropriate to store such a big literal. Thus the right operand is converted also to this unsigned type by propogating the sign bit. As the left operand represents itself the maximum value of the type then they are equal.

29,715

Bahadır

Updated on July 09, 2022

Comments

Bahadır almost 2 years
When I was working on string::npos I noticed something and I couldn't find any explanation for it on the web.
```
(string::npos == ULONG_MAX)
```
and
```
(string::npos == -1)
```
are true.

So I tried this:
```
(18446744073709551615 == -1)
```
which is also true.

How can it be possible? Is it because of binary conversation?
- Stargateur over 7 years
  
  overflow :p you compare a unsigned and a signed value
- rubenvb over 7 years
  
  This is not undefined behaviour.
- lelloman over 7 years
  
  18446744073709551615 = 2^64 -1 ... spooky coincidence?
- Edgar Rokjān over 7 years
  
  @George it is not UB.
- davmac over 7 years
  
  It is implementation defined behavior. The question is not really correct; it's not always true that (18446744073709551615 == -1) is true.
- Evan Carroll almost 6 years
  
  @davmac I don't think that's true, I think the result of unsigned overflow is defined. can you source that? stackoverflow.com/questions/5416414/…
- Evan Carroll almost 6 years
  
  Also @rubenvb ^
- Evan Carroll almost 6 years
  
  @Bahadır check out my answer for more information =)
- davmac almost 6 years
  
  @EvanCarroll Hmm. 18446744073709551615 is a decimal literal, without a suffix. Its type should be int, long int or long long int or a signed "extended integer type" (see Integer Literals [lex.icon]). Since "A program is ill-formed if one of its translation units contains an integer literal that cannot be represented by any of the allowed types", a program containing this expression is actually ill-formed unless it's long long int is large enough, in which case the expression should be false and not true. Typically compilers apply an unsigned type, hence implementation defined.
- davmac almost 6 years
  
  @EvanCarroll OTOH 18446744073709551615u == -1 is not ill-formed assuming that unsigned long long (or a suitable extended type) is large enough. It will only be true, though, if the size of the type it has is 64 bits - hence still implementation defined.
- Evan Carroll almost 6 years
  
  @davmac It's platform/arch-dependent, the spec defines the behavior. As are many operations in C and C++. The spec defines that assuming an unsigned long and signed long are 64bits the result should be true because unsigned overflow is allowed to occur.
- davmac almost 6 years
  
  @EvanCarroll some context has been lost here since an earlier comment (which seems to have been deleted) was claiming it was undefined behaviour, my comment was just to say that's not necessarily the case. Maybe you're right and "implementation defined" is the wrong terminology, but the point stands that the true/false value of the comparison depends on implementation details (the size of various int types). Also as I noted just above, the expression is really either ill-formed or false, it can't ever properly be true - but various real compilers allow it, with a true value.
rubenvb over 7 years

Negative numbers are not stored as 2s complement by definition/Standard. They only behave that way.
Baum mit Augen over 7 years

There is no signed overflow here.
doron over 7 years

This is CPU architecture dependent but most modern architectues use 2s complement
rubenvb over 7 years

This is not CPU architecture dependent as it is explicitly defined by the C++ Standard.
davmac almost 6 years

Strictly speaking the bit representations of the two numbers before conversion being the same doesn't matter. Even with 1's complement or signed magnitude representations, the conversion of (signed) -1 to unsigned long will always result in ULONG_MAX. (The bit pattern will be the same after conversion of course).
rubenvb almost 6 years

All the ones and zeroes in this answer are implementation defined (except the -1's). In C++ it also doesn't matter how these are represented in binary at all.
Red.Wave over 5 years

@davmac how come? ULONG_MAX==2^32-1 while ~1==2^32 -2. On a one's complement system -1==~1==(ULONG_MAX-1).
davmac over 5 years

@Red.Wave ~1==(ULONG_MAX-1) isn't correct. They may have the same representation with a one's complement representation but do not have the same value. Conversion of -1 to unsigned long is specified by the language to result in ULONG_MAX. Therefor if -1==~1 (as in one's complement) then ~1==ULONG_MAX, because that comparison applies usual arithmetic conversions which converts -1 to unsigned long (which yields ULONG_MAX).
Red.Wave over 5 years

@davmac any citations plz? A nop conversion would not result in that equality. Actually ULONG_MAX=-0=~0, assuming nop coversion on ones complement system. Setting -0=-1 would result in skews in logics and difficulty in implementation that defeats the purpose of one's complement.
davmac over 5 years

@Red.Wave assuming nop conversion isn't correct. C99 6.3.1.3, "Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type"
davmac over 5 years

@Red.Wave or, since this question is tagged C++ and not C, see stackoverflow.com/questions/2711522/…
davmac over 5 years

According to the C++ language standard (at least C++11 as per open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3690.pdf section 2.14.2) the left literal must have a signed type, not an unsigned type. Compilers which translate it to an unsigned type in cases where there is no suitable signed type appear to be doing so as an extension to the language.