If I copy a float to another variable, will they be equal?

c++ floating-point

13,487

Solution 1

Besides the assert(NaN==NaN); case pointed out by kmdreko, you can run into trouble with x87 math, when 80-bit floats are temporarily stored to memory and later compared to values that are still held in a register.

Possible minimal example, which fails with GCC 9.2 when compiled with -O2 -m32:

#include <cassert>

int main(int argc, char**){
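    float x = 1.f/(argc+2);  // computed at run time, so it cannot be constant-folded
    volatile float y = x;    // volatile forces y to be stored to and reloaded from memory
    assert(x==y);            // x may still carry extra x87 precision in a register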
}

Godbolt Demo: https://godbolt.org/z/X-Xt4R

The volatile can probably be omitted if you manage to create enough register pressure to have y stored to and reloaded from memory (while confusing the compiler enough that it does not omit the comparison altogether).

See GCC FAQ reference:

Why floating-point results change with optimization levels or different compiler versions or different target architectures?

Solution 2

It won't be true if x is NaN, since == comparisons involving NaN are always false (yes, even NaN == NaN). For all other cases (normal values, subnormal values, infinities, zeros) this assertion will be true.

The advice to avoid == for floats applies to calculations, because floating-point numbers cannot represent many arithmetic results exactly. Assignment is not a calculation, and there is no reason for an assignment to yield a value different from the original.

Extended-precision evaluation should be a non-issue if the standard is followed. From <cfloat> inherited from C [5.2.4.2.2.8] (emphasis mine):

Except for assignment and cast (which remove all extra range and precision), the values of operations with floating operands and values subject to the usual arithmetic conversions and of floating constants are evaluated to a format whose range and precision may be greater than required by the type.

However, as the comments have pointed out, certain combinations of compiler, build options, and target can paradoxically make this false.

Solution 3

Yes, y will assuredly take on the value of x:

[expr.ass]/2: In simple assignment (=), the object referred to by the left operand is modified ([defns.access]) by replacing its value with the result of the right operand.

There is no leeway for other values to be assigned.

(Others have already pointed out that an equivalence comparison == will nonetheless evaluate to false for NaN values.)

The usual issue with floating-point == is that it's easy to not have quite the value you think you do. Here, we know that the two values, whatever they are, are the same.

Solution 4

Yes, in all cases (disregarding NaNs and x87 issues), this will be true.

If you do a memcmp on them, you will be able to test for equality while also being able to compare NaNs and sNaNs. This also requires the compiler to take the address of the variable, which coerces the value into a 32-bit float instead of an 80-bit one and so eliminates the x87 issues. The second assertion here is intended to fail, to show that == will not compare NaNs as true:

#include <cmath>
#include <cassert>
#include <cstring>

int main(void)
{
    float x = std::nan("");
    float y = x;
    assert(!std::memcmp(&y, &x, sizeof(float)));  // passes: the bytes are identical
    assert(y == x);                               // fails on purpose: NaN == NaN is false
    return 0;
}

Note that if the NaNs have a different internal representation (i.e. differing mantissa), the memcmp will not compare true.

Solution 5

In usual cases, it would evaluate to true (that is, the assert statement won't do anything).

Edit:

By 'usual cases' I mean excluding the aforementioned scenarios (such as NaN values and 80x87 floating-point units), as pointed out by other users.

Given the obsolescence of 8087 chips in today's context, the issue is rather isolated; for the floating-point architectures in current use, it's true for all cases except NaNs.

(reference about 8087 - https://home.deec.uc.pt/~jlobo/tc/artofasm/ch14/ch143.htm)

Kudos to @chtz for reproducing a good example and @kmdreko for mentioning NaNs - didn't know about them before!


Wei Li

Updated on June 08, 2022

Comments

Wei Li almost 2 years
I know that using == to check equality of floating-point variables is not a good idea. But I just want to know, given the following statements:
```
float x = ...

float y = x;

assert(y == x)
```
Since y is copied from x, will the assertion be true?
- Thomas Weller over 4 years
  
  Let me provide a bounty of 50 to someone who actually proves inequality by a demonstration with real code. I want to see the 80 vs 64 bit thing in action. Plus another 50 for an explanation of the generated assembler code that shows one variable being in a register and the other not (or whatever the reason for the inequality might be, I'd like it explained on a low level).
- pjc50 over 4 years
  
  @ThomasWeller the GCC bug about this: gcc.gnu.org/bugzilla/show_bug.cgi?id=323 ; however, I've just tried to repro it on an x86-64 system and it doesn't, even with -ffast-math. I suspect you need an old GCC on a 32-bit system.
- MSalters over 4 years
  
  @pjc50: Actually you need an 80-bit system to reproduce bug 323; it's the 80x87 FPU which caused the problem. x86-64 uses the SSE FPU. The extra bits cause the problem, because they're rounded when spilling a value to a 32 bits float.
- Cody Gray over 4 years
  
  If MSalters's theory is correct (and I suspect it is), then you can repro either by compiling for 32-bit (-m32), or by instructing GCC to use the x87 FPU (-mfpmath=387).
- Hot Licks over 4 years
  
  There is the problem that, on some mythical hardware, the float value could be converted from 32 bit in storage to 48 bit in a register, then back to 32 when stored. This could result in a very small change in the bit value, especially if doing this "normalized" a value which was not initially normalized.
- Cody Gray over 4 years
  
  Change "48 bit" to "80 bit", and then you can remove the "mythical" adjective there, @Hot. That's precisely what was being discussed immediately before your comment. The x87 (FPU for x86 architecture) uses 80-bit registers, an "extended-precision" format.
- supercat over 4 years
  
  @CodyGray: An implementation targeting e.g. a 32-bit ARM could benefit from performing float calculations using a 64-bit "extended float" with 32-bit sign+exponent and 32-bit significand. Values of type float will need to be converted to such a format when doing computations upon them, and given an expression like a+b+c, keeping a+b in not-necessarily-normalized unpacked format without normalizing it before adding c would be faster than converting the intermediate result back to float.
David Schwartz over 4 years

I thought it was entirely possible for x to be in a floating point register while y is loaded from memory. Memory might have less precision than a register, causing the comparison to fail.
David Schwartz over 4 years

What if x is computed in a register in the first line, keeping more precision than the minimum for a float. The y = x may be in memory, keeping only float precision. Then the test for equality would be done with the memory against the register, with different precisions, and thus no guarantee.
Yakk - Adam Nevraumont over 4 years

x+pow(b,2)==x+pow(a,3) could differ from auto one=x+pow(b,2); auto two=y+pow(a,3); one==two because one might compare using more precision than the other (if one/two are 64 bit values in RAM, while intermediate values are 80ish bits on the FPU). So assignment can do something, sometimes.
Anirban166 over 4 years

That might be one case for a false, I haven't thought that far. (since the OP didn't provide any special cases, I am assuming no additional constraints)
David Schwartz over 4 years

I don't really understand what you're saying. As I understand the question, the OP is asking if copying a float and then testing for equality is guaranteed to succeed. Your answer seems to be saying "yes". I'm asking why the answer isn't no.
Evg over 4 years

What about compiler flags? It seems that they can break this rule. For example, MSVC documentation on /fp:fast reads: "The compiler may omit rounding at assignment statements, typecasts, or function calls."
kmdreko over 4 years

@evg Sure! My answer simply follows the standard. All bets are off if you tell your compiler to be non-conforming, especially when enabling fast-math.
Eric Postpischil over 4 years

The edit makes this answer incorrect. The C++ standard requires that assignment convert the value to the destination type—excess precision may be used in expression evaluations but may not be retained through assignment. It is immaterial whether the value is held in a register or memory; the C++ standard requires it be, as the code is written, a float value without extra precision.
Eric Postpischil over 4 years

Re “any extra precision truncated”: While the C standard says the extra range and precision are “removed”, this does not mean truncated. More commonly, round-to-nearest-ties-to-even is used.
Voo over 4 years

This answer is practically false, because compilers can and will compare floating point variables in registers to those spilled on the stack which will cause the comparison to be false. Whether that's a bug in the compiler or not I couldn't say, but the quoted standard section doesn't convince me that that behavior is not allowed.
AProgrammer over 4 years

@EricPostpischil, What you wrote is true but I'd add that gcc has a well known bug (323 if my memory serves) which makes it use extra precision when it shouldn't. I think that only x86 is affected (at least I'm pretty sure that x86_64 is not).
Lightness Races in Orbit over 4 years

@Voo See the quote in my answer. The value of the RHS is assigned to the variable on the LHS. There is no legal justification for the resulting value of the LHS to differ from the value of the RHS. I appreciate that several compilers have bugs in this regard. But whether something's stored in a register is supposed to have nothing to do with it.
Lightness Races in Orbit over 4 years

@David seems to be assuming that = is implemented in terms of some specific computer architecture, rather than mathematically. Again, I concede that some compilers buggily do that. But, where so, they are non-compliant. Remember, C++ is an abstraction, not a one-to-one mapping to CPU instructions.
TripeHound over 4 years

@AProgrammer Given that a(n extremely) buggy compiler could theoretically cause int a=1; int b=a; assert( a==b ); to throw an assertion, I think it only makes sense to answer this question in relation to a correctly-functioning compiler (while possibly noting that some versions of some compilers do / have-been-known-to get this wrong). In practical terms, if for some reason a compiler doesn't remove the extra precision from the result of a register-stored assignment, it should do so before it uses that value.
user253751 over 4 years

@kmdreko Still worth mentioning in the answer, I think.
MSalters over 4 years

@AProgrammer: 323 is indeed about the use of the 80 bits x87 FPU to hold extended-precision results. C++ allows that for temporaries, not for objects. It's apparently not fixed because the x87 has no hardware support for 64 bits operations, and the GCC people can't tell objects and temporaries apart where it's needed so they'd have to do software emulation of 64 bits precision everywhere.
Peter Cordes over 4 years

@Voo: In ISO C++, rounding to type width is supposed to happen on any assignment. In most compilers that target x87, it really only happens when the compiler decides to spill / reload. You can force it with gcc -ffloat-store for strict compliance. But this question is about x=y; x==y; without doing anything to either var in between. If y is already rounded to fit in a float, converting to double or long double and back won't change the value. ...
Peter Cordes over 4 years

... @Voo: If we're talking about real compilers (like gcc without -ffloat-store) I don't see a plausible mechanism for rounding one var but not the other. Assume y is in a register and has extra precision so it's not equal to any float. Either optimization is disabled and they'll both be spilled/reloaded, or it will see that x is still just another name for y. Hmm, possibly with register float y and plain float x. Either way, I'm really glad x87 is obsolete and not used for FP math anymore.
Peter Cordes over 4 years

@DavidSchwartz: See my previous 2 comments: converting a narrow float to a wider type can't change the value, as long as both are binary FP types. (Or both are decimal FP). The object-representation / bit-pattern can change, of course, but FP comparison is based on the value represented. i.e. a hardware instruction like fcomi that supports comparing an x87 register against a value from memory in a narrower format has to implicitly convert. The same conversion done twice will always produce the same result. (Even if you changed FP rounding modes: widening is always exact, no rounding)
Thomas Weller over 4 years

Any opinions against awarding this answer a bounty of 100? Let me know your concerns.
Nat over 4 years

It seems strange that the extra bits would be considered in comparing a float with standard precision to extra precision.
Lightness Races in Orbit over 4 years

@ThomasWeller That's a known bug in a consequently non-compliant implementation. Good to mention it though!
Lightness Races in Orbit over 4 years

@Nat It is strange; this is a bug.
Lightness Races in Orbit over 4 years

@ThomasWeller No, that's a reasonable award. Though I would like the answer to point out that this is non-compliant behaviour
chtz over 4 years

I can extend this answer, pointing out what exactly happens in the assembly code, and that this actually violates the standard -- though I wouldn't call myself a language-lawyer, so I can't guarantee that there isn't an obscure clause which explicitly allows that behavior. I assume the OP was more interested in practical complications on actual compilers, not on completely bug-free, fully compliant compilers (which de-facto don't exist, I guess).
OrangeDog over 4 years

@kmdreko GCC is by default non-conforming. The option -ffloat-store should ensure assignments always cause truncation.
OrangeDog over 4 years

Worth mentioning that -ffloat-store appears to be the way to prevent this.
chtz over 4 years

@OrangeDog I'll mention -ffloat-store, though I don't think this alone will make gcc fully compliant, e.g., even expressions like float x = (a+b)+c; won't necessarily round a+b to a float before adding c (I need to check that, though). Another example of gcc being non-compliant: With FMA enabled, gcc is also happy to implement x = a+b*c; with a fused-multiply-addition, even though it (usually) produces different results than storing b*c in a temporary before adding a (clang does this only with certain -ffast-math options).
OrangeDog over 4 years

@chtz according to the specification, rounding is only required for assignment and cast, so your first example is fine.
Lightness Races in Orbit over 4 years

Question is tagged c++, not my-machine-here ;)
kmdreko over 4 years

Is there a case where this could happen outside x87? The only other case I could see this happening with sse instructions is if the standard was ignored and extended-precision was forced, but I can't find the flags to do that, if possible.
chtz over 4 years

@kmdreko This very issue should not happen with SSE-math (but might of course happen with any other extended-precision implementation on compilers with similar behavior). I don't feel confident to rule out any other problems.
supercat over 4 years

@kmdreko: On many machines without an FPU, a floating-point type with a 32-bit significand could be processed more efficiently than one with a 24-bit significand, and one with a 64-bit significand could be processed more efficiently than a 53-bit one. Extended types weren't invented for the 8087.
supercat over 4 years

@PeterCordes: The problems with x87 stem from the fact that implementations used extended-precision math inconsistently and unpredictably. The pattern of converting values to extended precision, performing computations on them, and then rounding them to lower-precision formats at well defined times can on many platforms be faster and yield more accurate results than trying to round at every stage of computation. Such problems stemmed in large part from the unfortunate way the Standard opted to handle variadic arguments of its new long double type.
Eric Towers over 4 years

At first, I thought that language lawyering the distinction between "value" and "result" would be perverse, but this distinction is not required to be without difference by the language of C2.2, 7.1.6; C3.3, 7.1.6; C4.2, 7.1.6, or C5.3, 7.1.6 of the draft Standard you cite.
Peter Cordes over 4 years

@KonradRudolph: yes, it's clear in the standard but many widely-used compilers (when targeting x87 or other wide-FP-temporary ISAs) do not comply with the letter of the standard for this; see my replies to @ Voo above. So a "practical" answer that addresses the issue for real compilers is important! In this case I think it's implausible for x == y to be false when y == y would be true (i.e. for it to be different from !isnan(y)) even in practice with gcc without -ffloat-store. Anything that didn't work this way would be one step farther into insanity than current compilers.
Thomas Weller over 4 years

@LightnessRacesBY-SA3.0: that linked bug seems very old but unfixed. The status is "suspended" and there are comments that it "will not" be fixed. Thus, an assert(a==b) is never a good idea for floats. Did I get this right?
Konrad Rudolph over 4 years

@PeterCordes David’s comment seems to be deleted now but IIRC his question was explicitly challenging Lightness on the standard, which is what my comment refers to.
Lightness Races in Orbit over 4 years

@EricTowers Sorry can you clarify those references? I'm not finding what you're pointing to
Eric Towers over 4 years

@LightnessRacesBY-SA3.0 : C. C2.2, C3.3, C4.2, and C5.3.
Lightness Races in Orbit over 4 years

@EricTowers Yeah, still not following you. Your first link goes to the Appendix C index (doesn't tell me anything). Your next four links all go to [expr]. If I'm to ignore the links and focus on the citations, I'm left with the confusion that e.g. C.5.3 doesn't seem to address the use of the term "value" or the term "result" (though it does use "result" once in its normal English context). Perhaps you could more clearly describe where you think the standard makes a distinction, and provide a single clear citation to this happening. Thanks!
Eric Towers over 4 years

@LightnessRacesBY-SA3.0 : Ah. I see. The ToC of this document is intentionally confusing. Try 7.1.6, starting with the text "The values of the floating-point operands and the results of floating-point expressions...".
Lightness Races in Orbit over 4 years

@EricTowers Aha there we go! Hmm not sure that's really expressing a difference between the two concepts in a way that should affect our answer here tbh, unless there is wording elsewhere stating that when the result of an operation is assigned to an object, the resulting value can differ from said result. And I doubt that's the case.
Eric Towers over 4 years

As I said in my original comment: language lawyering that these two things are permitted different would be perverse, except that the parallel construction in the cited section is for two distinct things not two varieties of the same thing. Value is defined in 6.8.4 and result in 7.2.5. I don't see that a result must be an element of the same implementation-defined set of values as used to define the value of the relevant type.
Nat over 4 years

I wonder if <= or >= are broken, too? I mean, most of the time, folks know that they can't really rely on == to logically assess if two floating-point values are equivalent in a mathematical sense, so that == is broken on a type used primarily in numerics isn't as bad as it could've been with, e.g., integer types. But if this also breaks, say, the assumption that a non-NaN value being <= implies that it's >, then that'd seem like a major mess.
David Hammen about 4 years

Re since comparisons on NaN are always false -- That is not quite correct. Comparing whether any value is not equal to NaN, or whether NaN is not equal to any value is always true, even if the other value is itself a NaN.