Problems casting NAN floats to int

c floating-point nan

11,782

Solution 1

The result of casting a floating-point number to an integer is undefined or unspecified when the value lies outside the range of the integer type (give or take 1, because the conversion truncates).

Clause 6.3.1.4:

When a finite value of real floating type is converted to an integer type other than _Bool, the fractional part is discarded (i.e., the value is truncated toward zero). If the value of the integral part cannot be represented by the integer type, the behavior is undefined.

If the implementation defines __STDC_IEC_559__, then for conversions from a floating-point type to an integer type other than _Bool:

if the floating value is infinite or NaN or if the integral part of the floating value exceeds the range of the integer type, then the "invalid" floating-point exception is raised and the resulting value is unspecified.

(Annex F [normative], section F.4.)

If the implementation doesn't define __STDC_IEC_559__, then all bets are off.
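Since the cast itself gives no guarantee for NaN or out-of-range input, one option is to check the value before converting. The following is a minimal sketch, not a standard facility: `double_to_int_checked` is a hypothetical name, and the bounds comparison assumes a 32-bit `int`, so that `INT_MIN - 1` and `INT_MAX + 1` are exactly representable as `double`.

```c
#include <limits.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical helper: converts d to int only when the truncated value
   is representable. Returns true and stores the result in *out on
   success; returns false and leaves *out untouched for NaN, infinities,
   and out-of-range values. */
bool double_to_int_checked(double d, int *out)
{
    if (isnan(d))
        return false;

    /* Truncation toward zero fits in int iff INT_MIN - 1 < d < INT_MAX + 1.
       Assumption: int is 32 bits wide, so both bounds are exact doubles.
       Infinities fail this test as well. */
    if (d <= (double)INT_MIN - 1.0 || d >= (double)INT_MAX + 1.0)
        return false;

    *out = (int)d;
    return true;
}
```

A caller would test the return value and fall back to whatever default suits the program when the conversion is rejected, instead of depending on what a particular compiler or CPU happens to produce.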

Solution 2

There is a reason for this behavior, but it is not something you should usually rely on.

As you note, IEEE-754 does not specify the result of converting a floating-point NaN to an integer, except that the conversion should raise an invalid operation exception, which your compiler probably ignores. The C standard says the behavior is undefined (an implementation that follows Annex F narrows this to an unspecified value). Undefined means that not only do you not know what integer result you will get, you do not know what your program will do at all; the standard allows the program to abort, produce crazy results, or do anything else. You probably ran this program on an Intel processor, and your compiler probably did the conversion using one of the built-in instructions. Intel specifies instruction behavior very carefully: converting a floating-point NaN to a 32-bit integer returns 0x80000000 regardless of the NaN's payload, which is what you observed.

Because Intel specifies the instruction behavior, you can rely on it if you know the instruction used. However, since the compiler does not provide such guarantees to you, you cannot rely on this instruction being used.

Solution 3

First, a NaN is any bit pattern that the IEEE standard does not treat as a number, so there are many NaN values rather than a single one. The compiler I work with has both NAN and -NAN, for example.

Second, every compiler has its isnan family of functions to test for this case, so the programmer doesn't have to deal with the bits directly. To summarize, I don't think peeking at the value makes any difference. You might peek at the value to see its IEEE construction (sign, mantissa and exponent), but, again, each compiler, or rather its library, provides functions for that.
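As a small illustration of testing for NaN through the library rather than the bits, here is a sketch built on the standard `isnan` and `signbit` macros from `<math.h>`. The function name `nan_sign` is made up for this example.

```c
#include <math.h>

/* Hypothetical helper: classifies x without inspecting its bits.
   Returns 0 if x is not a NaN, +1 for a NaN whose sign bit is clear,
   and -1 for a NaN whose sign bit is set. */
int nan_sign(float x)
{
    if (!isnan(x))
        return 0;
    return signbit(x) ? -1 : 1;
}
```

For instance, `nan_sign(copysignf(NAN, -1.0f))` yields -1 on an IEEE-754 implementation, while `nan_sign(1.0f)` yields 0.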

I do have more to say about your testing, however.

float h = NAN;
printf("%x %d\n", (int)h, (int)h);

The cast you wrote truncates the float in order to convert it to an int. If you want the integer whose bits match the float's representation, do the following

printf("%x %d\n", *(int *)&h, *(int *)&h);

That is, you take the address of the float, then refer to it as a pointer to int, and eventually take the int value. This way the bit representation is preserved.
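One caveat: reading a float object through an `int *` runs against C's aliasing rules, so an optimizing compiler is not obliged to honor it. Copying the bytes with `memcpy` is the portable way to get the same effect, and it works in both directions. The sketch below assumes `float` is a 32-bit IEEE-754 value; `float_to_bits` and `bits_to_float` are hypothetical names chosen for the example.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: float has the same size as uint32_t (IEEE-754 binary32). */
_Static_assert(sizeof(float) == sizeof(uint32_t),
               "this sketch assumes a 32-bit float");

/* Hypothetical helper: returns the bit pattern stored in f. */
uint32_t float_to_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);  /* copy the object representation */
    return bits;
}

/* Hypothetical helper: builds a float from a raw bit pattern. */
float bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

With these, `printf("%x\n", (unsigned)float_to_bits(h));` prints the representation of `h`, and `h = bits_to_float(0x7fc00000u);` stores a quiet NaN bit pattern into it.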


Author by

Chris

Updated on June 03, 2022

Comments

Chris almost 2 years
Ignoring why I would want to do this, the 754 IEEE fp standard doesn't define the behavior for the following:
```
float h = NAN;
printf("%x %d\n", (int)h, (int)h);

Gives: 80000000 -2147483648
```
Basically, regardless of what value of NAN I give, it outputs 80000000 (hex) or -2147483648 (dec). Is there a reason for this and/or is this correct behavior? If so, how come?

The way I'm giving it different values of NaN are here: How can I manually set the bit value of a float that equates to NaN?

So basically, are there cases where the payload of the NaN affects the output of the cast?

Thanks!
Chris about 12 years

Given that the behavior is undefined, is the result I got the common one for this undefined behavior? That is, is anyone aware of a system where I would get different behavior than this? The 754 spec says the behavior of NaN operations is that the payload should be carried through.
Daniel Fischer about 12 years

I'm not aware of an implementation that does otherwise, but I'm not familiar with anything beyond a bit of gcc. gcc produces INT_MIN for all out-of-range conversions to int, as far as I know (but that's also only very little).
R.. GitHub STOP HELPING ICE about 12 years

I'm pretty sure you mean gcc on x86. There's no reason to assume the result should be the same everywhere else; this is likely an artifact of the fpu's behavior.
Daniel Fischer about 12 years

Oh, um, blush, I sure do. But of course, non-x86 hardware is a myth invented by Apple to sell more Macs. (Thanks for the correction, @R..)
R.. GitHub STOP HELPING ICE about 12 years

I thought it was a myth invented by Google to sell phones. ;-)
John Zwinck over 9 years

It may be true that Intel processors convert NAN to a 32-bit int as 0x80000000, but this won't help you if your NAN is a constant value as determined by your compiler. In such cases you may see values other than INT_MIN, because the conversion is done at compile time rather than runtime, so Intel's x86 semantics never come into play. For example, when GCC converts NAN to int at compile time, it gives 0.
John Zwinck over 9 years

GCC itself will give you 0 when converting NAN to int at compile-time (rather than run-time, where you get INT_MIN). So even on a single platform you can get two different values, depending on whether the compiler was able to determine your NAN as a constant value.
hukeping over 4 years

Hi @Israel, printf("%x %d\n", *(int *)&h, *(int *)&h); is a good way to get the bit representation from an address. Is there any way to write the bit representation back to an address? Say, write 0x7ff8000000000000 to &h?
vinc17 over 4 years

The result is not undefined, but one gets an unspecified value (see ISO C17, F.4). With GCC 4.6 to the trunk (10.0.0 snapshot), under Linux/x86_64 (Debian/unstable), I get for conversions from volatile double NAN to int, unsigned int, long, unsigned long: INT_MIN, 0, LONG_MIN, LONG_MAX+1 respectively (the last two values have the same representation, but this is not the case for the int and unsigned int results).
Daniel Fischer over 4 years

@vinc17 Thanks for the heads-up. Turns out that was already the case in C11, but I hadn't read the annex and went by just what was stated in 6.3.1.4.
Peter Cordes about 3 years

felixcloutier.com/x86/cvttsd2si is the instruction in question. x86's "integer indefinite" value is MSB=1, rest = 0, i.e. INT_MIN or INT64_MIN. As you say, a different usage of the instruction can have different results, e.g. float -> uint32_t on x86-64 will often convert to int64_t and take the low half because that's basically free in asm, and x86 (before AVX-512) doesn't provide FP -> unsigned conversions directly. (C doesn't define the behaviour of negative FP -> unsigned; the modular reduction is only for wide integral type -> unsigned).
Peter Cordes about 3 years

As you say, other ISAs can be different, e.g. unsigned conversion in C works as expected on x86 but not ARM
pmor about 2 years

A reminder: "defines __STDC_IEC_559__" does not mean that the implementation conforms to the specifications in Annex F. It means that it may conform (or intends to conform) to them.