If I copy a float to another variable, will they be equal?
Solution 1
Besides the assert(NaN==NaN); case pointed out by kmdreko, you can have situations with x87-math, when 80-bit floats are temporarily stored to memory and later compared to values which are still stored inside a register.
Possible minimal example, which fails with gcc9.2 when compiled with -O2 -m32:
#include <cassert>
int main(int argc, char**){
float x = 1.f/(argc+2);
volatile float y = x;
assert(x==y);
}
Godbolt Demo: https://godbolt.org/z/X-Xt4R
The volatile can probably be omitted if you manage to create sufficient register pressure to have y stored to and reloaded from memory (while confusing the compiler enough that it does not optimize the comparison away altogether).
See GCC FAQ reference:
Solution 2
It won't be true if x is NaN, since comparisons on NaN are always false (yes, even NaN == NaN). For all other cases (normal values, subnormal values, infinities, zeros) this assertion will be true.
The advice for avoiding == for floats applies to calculations, because floating-point numbers cannot express many results exactly when used in arithmetic expressions. Assignment is not a calculation and there's no reason that assignment would yield a different value than the original.
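A minimal sketch of that distinction, assuming IEEE-754 arithmetic with no x87 excess precision and no fast-math flags. The helper names copy_compares_equal and sum_matches_literal are made up for illustration:

```cpp
#include <limits>

// Hypothetical helper: copies x and compares the copy with the original.
// No arithmetic is involved, so under the stated assumptions this is
// false only when x is NaN.
bool copy_compares_equal(float x) {
    float y = x;
    return y == x;
}

// Hypothetical helper: compares a computed sum against a literal.
// Here the addition rounds, which is where the usual "avoid ==" advice
// comes from.
bool sum_matches_literal() {
    double a = 0.1;
    double b = 0.2;
    return a + b == 0.3;
}

// Hypothetical helper: a quiet NaN to feed into copy_compares_equal.
float quiet_nan() {
    return std::numeric_limits<float>::quiet_NaN();
}
```

Under those assumptions, copy_compares_equal should return true for ordinary values, zeros, subnormals and infinities, and false for quiet_nan(), while sum_matches_literal() returns false even though no copy is involved.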
Extended-precision evaluation should be a non-issue if the standard is followed. From <cfloat>, inherited from C [5.2.4.2.2.8] (emphasis mine):
Except for assignment and cast (which remove all extra range and precision), the values of operations with floating operands and values subject to the usual arithmetic conversions and of floating constants are evaluated to a format whose range and precision may be greater than required by the type.
However, as the comments have pointed out, some cases with certain compilers, build-options, and targets could make this paradoxically false.
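Whether intermediate results may carry extra precision is reported by the FLT_EVAL_METHOD macro from <cfloat>. The sketch below only translates that value into words; the helper names describe_eval_method and current_eval_method are hypothetical, and which value a given compiler and flag combination reports is something to check on your own setup (x87 targets typically report 2, SSE targets 0):

```cpp
#include <cfloat>
#include <string>

// Hypothetical helper: describes a FLT_EVAL_METHOD value as defined by
// the C standard. Negative values other than -1 are implementation-defined.
std::string describe_eval_method(int method) {
    switch (method) {
        case 0:  return "operations are evaluated in the range and precision of their type";
        case 1:  return "float and double operations are evaluated as double";
        case 2:  return "all operations are evaluated as long double";
        default: return "indeterminable or implementation-defined";
    }
}

// What the current compiler and flags report.
std::string current_eval_method() {
    return describe_eval_method(FLT_EVAL_METHOD);
}
```

If current_eval_method() reports anything other than the first description, expressions may be evaluated with extra precision, and the assignment and cast rule quoted above is what is supposed to remove it.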
Solution 3
Yes, y will assuredly take on the value of x:
[expr.ass]/2: In simple assignment (=), the object referred to by the left operand is modified ([defns.access]) by replacing its value with the result of the right operand.
There is no leeway for other values to be assigned.
(Others have already pointed out that an equivalence comparison == will nonetheless evaluate to false for NaN values.)
The usual issue with floating-point == is that it's easy to not have quite the value you think you do. Here, we know that the two values, whatever they are, are the same.
Solution 4
Yes, in all cases (disregarding NaNs and x87 issues), this will be true.
If you do a memcmp on them you will be able to test for equality while being able to compare NaNs and sNaNs. This will also require the compiler to take the address of the variable, which will coerce the value into a 32-bit float instead of an 80-bit one. This will eliminate the x87 issues. The second assertion here is intended to fail to show that == will not compare NaNs as true:
#include <cmath>
#include <cassert>
#include <cstring>
int main(void)
{
float x = std::nan("");
float y = x;
assert(!std::memcmp(&y, &x, sizeof(float)));
assert(y == x);
return 0;
}
Note that if the NaNs have a different internal representation (i.e. differing mantissa), the memcmp will not compare true.
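A rough sketch of that caveat, assuming float is IEEE-754 binary32 (checked with a static_assert). The helper names float_from_bits, same_bits and distinct_nans are invented for this example, and 0x7fc00000 and 0x7fc00001 are simply two quiet-NaN bit patterns that differ in the lowest mantissa bit:

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(std::numeric_limits<float>::is_iec559 &&
                  sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes float is IEEE-754 binary32");

// Hypothetical helper: builds a float from a raw 32-bit pattern.
float float_from_bits(std::uint32_t bits) {
    float f;
    std::memcpy(&f, &bits, sizeof f);
    return f;
}

// Hypothetical helper: true when both floats have identical bytes.
bool same_bits(float a, float b) {
    return std::memcmp(&a, &b, sizeof(float)) == 0;
}

// Hypothetical helper: true when a and b are both NaN yet differ in
// at least one bit, so neither == nor memcmp treats them as equal.
bool distinct_nans(float a, float b) {
    return std::isnan(a) && std::isnan(b) && !same_bits(a, b);
}
```

For instance, distinct_nans(float_from_bits(0x7fc00000u), float_from_bits(0x7fc00001u)) is expected to be true, whereas a NaN compared with its own copy via same_bits is expected to match.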
Solution 5
In usual cases, it would evaluate to true (that is, the assert statement won't do anything).
Edit:
By 'usual cases' I mean that I am excluding the aforementioned scenarios (such as NaN values and 80x87 floating-point units) pointed out by other users.
Given the obsolescence of 8087 chips in today's context, the issue is rather isolated; for the floating-point architectures in current use, it's true for all cases except NaNs.
(reference about 8087 - https://home.deec.uc.pt/~jlobo/tc/artofasm/ch14/ch143.htm)
Kudos to @chtz for reproducing a good example and @kmdreko for mentioning NaNs - I didn't know about them before!
Wei Li
Updated on June 08, 2022
Comments
-
Wei Li almost 2 years
I know that using == to check equality of floating-point variables is not a good way. But I just want to know that with the following statements:
float x = ...
float y = x;
assert(y == x)
Since y is copied from x, will the assertion be true?
-
Thomas Weller over 4 years
Let me provide a bounty of 50 to someone who actually proves inequality by a demonstration with real code. I want to see the 80 vs 64 bit thing in action. Plus another 50 for an explanation of the generated assembler code that shows one variable being in a register and the other not (or whatever the reason for the inequality might be, I'd like it explained on a low level).
-
pjc50 over 4 years
@ThomasWeller the GCC bug about this: gcc.gnu.org/bugzilla/show_bug.cgi?id=323 ; however, I've just tried to repro it on an x86-64 system and it doesn't, even with -ffast-math. I suspect you need an old GCC on a 32-bit system.
-
MSalters over 4 years
@pjc50: Actually you need an 80-bit system to reproduce bug 323; it's the 80x87 FPU which caused the problem. x86-64 uses the SSE FPU. The extra bits cause the problem, because they're rounded when spilling a value to a 32 bits float.
-
Cody Gray over 4 years
If MSalters's theory is correct (and I suspect it is), then you can repro either by compiling for 32-bit (-m32), or by instructing GCC to use the x87 FPU (-mfpmath=387).
-
Hot Licks over 4 years
There is the problem that, on some mythical hardware, the float value could be converted from 32 bit in storage to 48 bit in a register, then back to 32 when stored. This could result in a very small change in the bit value, especially if doing this "normalized" a value which was not initially normalized.
-
Cody Gray over 4 years
Change "48 bit" to "80 bit", and then you can remove the "mythical" adjective there, @Hot. That's precisely what was being discussed immediately before your comment. The x87 (FPU for x86 architecture) uses 80-bit registers, an "extended-precision" format.
-
supercat over 4 years
@CodyGray: An implementation targeting e.g. a 32-bit ARM could benefit from performing float calculations using a 64-bit "extended float" with 32-bit sign+exponent and 32-bit significand. Values of type float will need to be converted to such a format when doing computations upon them, and given an expression like a+b+c, keeping a+b in not-necessarily-normalized unpacked format without normalizing it before adding c would be faster than converting the intermediate result back to float.
-
David Schwartz over 4 years
I thought it was entirely possible for x to be in a floating point register while y is loaded from memory. Memory might have less precision than a register, causing the comparison to fail.
-
David Schwartz over 4 years
What if x is computed in a register in the first line, keeping more precision than the minimum for a float. The y = x may be in memory, keeping only float precision. Then the test for equality would be done with the memory against the register, with different precisions, and thus no guarantee.
-
Yakk - Adam Nevraumont over 4 years
x+pow(b,2)==x+pow(a,3) could differ from auto one=x+pow(b,2); auto two=y+pow(a,3); one==two because one might compare using more precision than the other (if one/two are 64 bit values in ram, while intermediate values are 80ish bits on fpu). So assignment can do something, sometimes.
-
Anirban166 over 4 years
That might be one case for a false, I haven't thought that far. (since the OP didn't provide any special cases, I am assuming no additional constraints)
-
David Schwartz over 4 years
I don't really understand what you're saying. As I understand the question, the OP is asking if copying a float and then testing for equality is guaranteed to succeed. Your answer seems to be saying "yes". I'm asking why the answer isn't no.
-
Evg over 4 years
What about compiler flags? It seems that they can break this rule. For example, MSVC documentation on /fp:fast reads: "The compiler may omit rounding at assignment statements, typecasts, or function calls."
-
kmdreko over 4 years
@evg Sure! My answer simply follows the standard. All bets are off if you tell your compiler to be non-conforming, especially when enabling fast-math.
-
Eric Postpischil over 4 years
The edit makes this answer incorrect. The C++ standard requires that assignment convert the value to the destination type; excess precision may be used in expression evaluations but may not be retained through assignment. It is immaterial whether the value is held in a register or memory; the C++ standard requires it be, as the code is written, a float value without extra precision.
-
Eric Postpischil over 4 years
Re “any extra precision truncated”: While the C standard says the extra range and precision are “removed”, this does not mean truncated. More commonly, round-to-nearest-ties-to-even is used.
-
Voo over 4 years
This answer is practically false, because compilers can and will compare floating point variables in registers to those spilled on the stack which will cause the comparison to be false. Whether that's a bug in the compiler or not I couldn't say, but the quoted standard section doesn't convince me that that behavior is not allowed.
-
AProgrammer over 4 years
@EricPostpischil, What you wrote is true but I'd add that gcc has a well known bug (323 if my memory serves) which makes it use extra precision when it shouldn't. I think that only x86 is affected (at least I'm pretty sure that x86_64 is not).
-
Lightness Races in Orbit over 4 years
@Voo See the quote in my answer. The value of the RHS is assigned to the variable on the LHS. There is no legal justification for the resulting value of the LHS to differ from the value of the RHS. I appreciate that several compilers have bugs in this regard. But whether something's stored in a register is supposed to have nothing to do with it.
-
Lightness Races in Orbit over 4 years
@David seems to be assuming that = is implemented in terms of some specific computer architecture, rather than mathematically. Again, I concede that some compilers buggily do that. But, where so, they are non-compliant. Remember, C++ is an abstraction, not a one-to-one mapping to CPU instructions.
-
TripeHound over 4 years
@AProgrammer Given that a(n extremely) buggy compiler could theoretically cause int a=1; int b=a; assert( a==b ); to throw an assertion, I think it only makes sense to answer this question in relation to a correctly-functioning compiler (while possibly noting that some versions of some compilers do / have-been-known-to get this wrong). In practical terms, if for some reason a compiler doesn't remove the extra precision from the result of a register-stored assignment, it should do so before it uses that value.
-
user253751 over 4 years
@kmdreko Still worth mentioning in the answer, I think.
-
MSalters over 4 years
@AProgrammer: 323 is indeed about the use of the 80 bits x87 FPU to hold extended-precision results. C++ allows that for temporaries, not for objects. It's apparently not fixed because the x87 has no hardware support for 64 bits operations, and the GCC people can't tell objects and temporaries apart where it's needed so they'd have to do software emulation of 64 bits precision everywhere.
-
Peter Cordes over 4 years
@Voo: In ISO C++, rounding to type width is supposed to happen on any assignment. In most compilers that target x87, it really only happens when the compiler decides to spill / reload. You can force it with gcc -ffloat-store for strict compliance. But this question is about x=y; x==y; without doing anything to either var in between. If y is already rounded to fit in a float, converting to double or long double and back won't change the value. ...
-
Peter Cordes over 4 years
... @Voo: If we're talking about real compilers (like gcc without -ffloat-store) I don't see a plausible mechanism for rounding one var but not the other. Assume y is in a register and has extra precision so it's not equal to any float. Either optimization is disabled and they'll both be spilled/reloaded, or it will see that x is still just another name for y. Hmm, possibly with register float y and plain float x. Either way, I'm really glad x87 is obsolete and not used for FP math anymore.
-
Peter Cordes over 4 years
@DavidSchwartz: See my previous 2 comments: converting a narrow float to a wider type can't change the value, as long as both are binary FP types. (Or both are decimal FP). The object-representation / bit-pattern can change, of course, but FP comparison is based on the value represented. i.e. a hardware instruction like fcomi that supports comparing an x87 register against a value from memory in a narrower format has to implicitly convert. The same conversion done twice will always produce the same result. (Even if you changed FP rounding modes: widening is always exact, no rounding)
-
Thomas Weller over 4 years
Any opinions against awarding this answer a bounty of 100? Let me know your concerns.
-
Nat over 4 years
It seems strange that the extra bits would be considered in comparing a float with standard precision to extra precision.
-
Lightness Races in Orbit over 4 years
@ThomasWeller That's a known bug in a consequently non-compliant implementation. Good to mention it though!
-
Lightness Races in Orbit over 4 years
@Nat It is strange; this is a bug.
-
Lightness Races in Orbit over 4 years
@ThomasWeller No, that's a reasonable award. Though I would like the answer to point out that this is non-compliant behaviour
-
chtz over 4 years
I can extend this answer, pointing out what exactly happens in the assembly code, and that this actually violates the standard -- though I wouldn't call myself a language-lawyer, so I can't guarantee that there isn't an obscure clause which explicitly allows that behavior. I assume the OP was more interested in practical complications on actual compilers, not on completely bug-free, fully compliant compilers (which de-facto don't exist, I guess).
-
OrangeDog over 4 years
@kmdreko GCC is by default non-conforming. The option -ffloat-store should ensure assignments always cause truncation.
-
OrangeDog over 4 years
Worth mentioning that -ffloat-store appears to be the way to prevent this.
-
chtz over 4 years
@OrangeDog I'll mention -ffloat-store, though I don't think this alone will make gcc fully compliant, e.g., even expressions like float x = (a+b)+c; won't necessarily round a+b to a float before adding c (I need to check that, though). Another example of gcc being non-compliant: With FMA enabled, gcc is also happy to implement x = a+b*c; with a fused-multiply-addition, even though it (usually) produces different results than storing b*c in a temporary before adding a (clang does this only with certain -ffast-math options).
-
OrangeDog over 4 years
@chtz according to the specification, rounding is only required for assignment and cast, so your first example is fine.
-
Lightness Races in Orbit over 4 years
Question is tagged c++, not my-machine-here ;)
-
kmdreko over 4 years
Is there a case where this could happen outside x87? The only other case I could see this happening with sse instructions is if the standard was ignored and extended-precision was forced, but I can't find the flags to do that, if possible.
-
chtz over 4 years
@kmdreko This very issue should not happen with SSE-math (but might of course happen with any other extended-precision implementation on compilers with similar behavior). I don't feel confident to rule out any other problems.
-
supercat over 4 years
@kmdreko: On many machines without an FPU, a floating-point type with a 32-bit significand could be processed more efficiently than one with a 24-bit significand, and one with a 64-bit significand could be processed more efficiently than a 53-bit one. Extended types weren't invented for the 8087.
-
supercat over 4 years
@PeterCordes: The problems with x87 stem from the fact that implementations used extended-precision math inconsistently and unpredictably. The pattern of converting values to extended precision, performing computations on them, and then rounding them to lower-precision formats at well defined times can on many platforms be faster and yield more accurate results than trying to round at every stage of computation. Such problems stemmed in large part from the unfortunate way the Standard opted to handle variadic arguments of its new long double type.
-
Eric Towers over 4 years
At first, I thought that language lawyering the distinction between "value" and "result" would be perverse, but this distinction is not required to be without difference by the language of C2.2, 7.1.6; C3.3, 7.1.6; C4.2, 7.1.6, or C5.3, 7.1.6 of the draft Standard you cite.
-
Peter Cordes over 4 years
@KonradRudolph: yes, it's clear in the standard but many widely-used compilers (when targeting x87 or other wide-FP-temporary ISAs) do not comply with the letter of the standard for this; see my replies to @ Voo above. So a "practical" answer that addresses the issue for real compilers is important! In this case I think it's implausible for x == y to be false when y == y would be true (i.e. for it to be different from !isnan(y)) even in practice with gcc without -ffloat-store. Anything that didn't work this way would be one step farther into insanity than current compilers.
-
Thomas Weller over 4 years
@LightnessRacesBY-SA3.0: that linked bug seems very old but unfixed. The status is "suspended" and there are comments that it "will not" be fixed. Thus, an assert(a==b) is never a good idea for floats. Did I get this right?
-
Konrad Rudolph over 4 years
@PeterCordes David’s comment seems to be deleted now but IIRC his question was explicitly challenging Lightness on the standard, which is what my comment refers to.
-
Lightness Races in Orbit over 4 years
@EricTowers Sorry can you clarify those references? I'm not finding what you're pointing to
-
Eric Towers over 4 years
-
Lightness Races in Orbit over 4 years
@EricTowers Yeah, still not following you. Your first link goes to the Appendix C index (doesn't tell me anything). Your next four links all go to [expr]. If I'm to ignore the links and focus on the citations, I'm left with the confusion that e.g. C.5.3 doesn't seem to address the use of the term "value" or the term "result" (though it does use "result" once in its normal English context). Perhaps you could more clearly describe where you think the standard makes a distinction, and provide a single clear citation to this happening. Thanks!
-
Eric Towers over 4 years
@LightnessRacesBY-SA3.0 : Ah. I see. The ToC of this document is intentionally confusing. Try 7.1.6, starting with the text "The values of the floating-point operands and the results of floating-point expressions...".
-
Lightness Races in Orbit over 4 years
@EricTowers Aha there we go! Hmm not sure that's really expressing a difference between the two concepts in a way that should affect our answer here tbh, unless there is wording elsewhere stating that when the result of an operation is assigned to an object, the resulting value can differ from said result. And I doubt that's the case.
-
Eric Towers over 4 years
As I said in my original comment: language lawyering that these two things are permitted different would be perverse, except that the parallel construction in the cited section is for two distinct things not two varieties of the same thing. Value is defined in 6.8.4 and result in 7.2.5. I don't see that a result must be an element of the same implementation-defined set of values as used to define the value of the relevant type.
-
Nat over 4 years
I wonder if <= or >= are broken, too? I mean, most of the time, folks know that they can't really rely on == to logically assess if two floating-point values are equivalent in a mathematical sense, so that == is broken on a type used primarily in numerics isn't as bad as it could've been with, e.g., integer types. But if this also breaks, say, the assumption that a non-NaN value being <= implies that it's >, then that'd seem like a major mess.
-
David Hammen about 4 years
Re "since comparisons on NaN are always false" -- That is not quite correct. Comparing whether any value is not equal to NaN, or whether NaN is not equal to any value is always true, even if the other value is itself a NaN.
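
The point about != can be checked with a small sketch that records all six comparison operators at once. It assumes IEEE-754 comparison semantics (no fast-math flags); the Comparisons struct and the compare_all and quiet_nan helpers are hypothetical names used only here:

```cpp
#include <limits>

// Hypothetical record of the six comparison results for a pair of floats.
struct Comparisons {
    bool eq, ne, lt, le, gt, ge;
};

// Hypothetical helper: evaluates every comparison operator on a and b.
Comparisons compare_all(float a, float b) {
    return {a == b, a != b, a < b, a <= b, a > b, a >= b};
}

// Hypothetical helper: a quiet NaN to use as an operand.
float quiet_nan() {
    return std::numeric_limits<float>::quiet_NaN();
}
```

With a NaN on either side, only the ne member is expected to be true; the other five are expected to be false, including for compare_all(quiet_nan(), quiet_nan()).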