What is the most effective way for float and double comparison?

c++ algorithm optimization floating-point

541,945

Solution 1

Be extremely careful using any of the other suggestions. It all depends on context.

I have spent a long time tracing a bugs in a system that presumed a==b if |a-b|<epsilon. The underlying problems were:

The implicit presumption in an algorithm that if a==b and b==c then a==c.
Using the same epsilon for lines measured in inches and lines measured in mils (.001 inch). That is a==b but 1000a!=1000b. (This is why AlmostEqual2sComplement asks for the epsilon or max ULPS).
The use of the same epsilon for both the cosine of angles and the length of lines!
Using such a compare function to sort items in a collection. (In this case using the builtin C++ operator == for doubles produced correct results.)

Like I said: it all depends on context and the expected size of a and b.

BTW, std::numeric_limits<double>::epsilon() is the "machine epsilon". It is the difference between 1.0 and the next value representable by a double. I guess that it could be used in the compare function but only if the expected values are less than 1. (This is in response to @cdv's answer...)

Also, if you basically have int arithmetic in doubles (here we use doubles to hold int values in certain cases) your arithmetic will be correct. For example 4.0/2.0 will be the same as 1.0+1.0. This is as long as you do not do things that result in fractions (4.0/3.0) or do not go outside of the size of an int.

Solution 2

The comparison with an epsilon value is what most people do (even in game programming).

You should change your implementation a little though:

bool AreSame(double a, double b)
{
    return fabs(a - b) < EPSILON;
}

Edit: Christer has added a stack of great info on this topic on a recent blog post. Enjoy.

Solution 3

Comparing floating point numbers for depends on the context. Since even changing the order of operations can produce different results, it is important to know how "equal" you want the numbers to be.

Comparing floating point numbers by Bruce Dawson is a good place to start when looking at floating point comparison.

The following definitions are from The art of computer programming by Knuth:

bool approximatelyEqual(float a, float b, float epsilon)
{
    return fabs(a - b) <= ( (fabs(a) < fabs(b) ? fabs(b) : fabs(a)) * epsilon);
}

bool essentiallyEqual(float a, float b, float epsilon)
{
    return fabs(a - b) <= ( (fabs(a) > fabs(b) ? fabs(b) : fabs(a)) * epsilon);
}

bool definitelyGreaterThan(float a, float b, float epsilon)
{
    return (a - b) > ( (fabs(a) < fabs(b) ? fabs(b) : fabs(a)) * epsilon);
}

bool definitelyLessThan(float a, float b, float epsilon)
{
    return (b - a) > ( (fabs(a) < fabs(b) ? fabs(b) : fabs(a)) * epsilon);
}

Of course, choosing epsilon depends on the context, and determines how equal you want the numbers to be.

Another method of comparing floating point numbers is to look at the ULP (units in last place) of the numbers. While not dealing specifically with comparisons, the paper What every computer scientist should know about floating point numbers is a good resource for understanding how floating point works and what the pitfalls are, including what ULP is.

Solution 4

I found that the Google C++ Testing Framework contains a nice cross-platform template-based implementation of AlmostEqual2sComplement which works on both doubles and floats. Given that it is released under the BSD license, using it in your own code should be no problem, as long as you retain the license. I extracted the below code from ~~http://code.google.com/p/googletest/source/browse/trunk/include/gtest/internal/gtest-internal.h~~ https://github.com/google/googletest/blob/master/googletest/include/gtest/internal/gtest-internal.h and added the license on top.

Be sure to #define GTEST_OS_WINDOWS to some value (or to change the code where it's used to something that fits your codebase - it's BSD licensed after all).

Usage example:

double left  = // something
double right = // something
const FloatingPoint<double> lhs(left), rhs(right);

if (lhs.AlmostEquals(rhs)) {
  //they're equal!
}

Here's the code:

// Copyright 2005, Google Inc.
// All rights reserved.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are
// met:
//
//     * Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
//     * Redistributions in binary form must reproduce the above
// copyright notice, this list of conditions and the following disclaimer
// in the documentation and/or other materials provided with the
// distribution.
//     * Neither the name of Google Inc. nor the names of its
// contributors may be used to endorse or promote products derived from
// this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
// "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
// LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
// A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
// OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
// SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
// LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
// DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
// THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
// (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
//
// Authors: [email protected] (Zhanyong Wan), [email protected] (Sean Mcafee)
//
// The Google C++ Testing Framework (Google Test)


// This template class serves as a compile-time function from size to
// type.  It maps a size in bytes to a primitive type with that
// size. e.g.
//
//   TypeWithSize<4>::UInt
//
// is typedef-ed to be unsigned int (unsigned integer made up of 4
// bytes).
//
// Such functionality should belong to STL, but I cannot find it
// there.
//
// Google Test uses this class in the implementation of floating-point
// comparison.
//
// For now it only handles UInt (unsigned int) as that's all Google Test
// needs.  Other types can be easily added in the future if need
// arises.
template <size_t size>
class TypeWithSize {
 public:
  // This prevents the user from using TypeWithSize<N> with incorrect
  // values of N.
  typedef void UInt;
};

// The specialization for size 4.
template <>
class TypeWithSize<4> {
 public:
  // unsigned int has size 4 in both gcc and MSVC.
  //
  // As base/basictypes.h doesn't compile on Windows, we cannot use
  // uint32, uint64, and etc here.
  typedef int Int;
  typedef unsigned int UInt;
};

// The specialization for size 8.
template <>
class TypeWithSize<8> {
 public:
#if GTEST_OS_WINDOWS
  typedef __int64 Int;
  typedef unsigned __int64 UInt;
#else
  typedef long long Int;  // NOLINT
  typedef unsigned long long UInt;  // NOLINT
#endif  // GTEST_OS_WINDOWS
};


// This template class represents an IEEE floating-point number
// (either single-precision or double-precision, depending on the
// template parameters).
//
// The purpose of this class is to do more sophisticated number
// comparison.  (Due to round-off error, etc, it's very unlikely that
// two floating-points will be equal exactly.  Hence a naive
// comparison by the == operation often doesn't work.)
//
// Format of IEEE floating-point:
//
//   The most-significant bit being the leftmost, an IEEE
//   floating-point looks like
//
//     sign_bit exponent_bits fraction_bits
//
//   Here, sign_bit is a single bit that designates the sign of the
//   number.
//
//   For float, there are 8 exponent bits and 23 fraction bits.
//
//   For double, there are 11 exponent bits and 52 fraction bits.
//
//   More details can be found at
//   http://en.wikipedia.org/wiki/IEEE_floating-point_standard.
//
// Template parameter:
//
//   RawType: the raw floating-point type (either float or double)
template <typename RawType>
class FloatingPoint {
 public:
  // Defines the unsigned integer type that has the same size as the
  // floating point number.
  typedef typename TypeWithSize<sizeof(RawType)>::UInt Bits;

  // Constants.

  // # of bits in a number.
  static const size_t kBitCount = 8*sizeof(RawType);

  // # of fraction bits in a number.
  static const size_t kFractionBitCount =
    std::numeric_limits<RawType>::digits - 1;

  // # of exponent bits in a number.
  static const size_t kExponentBitCount = kBitCount - 1 - kFractionBitCount;

  // The mask for the sign bit.
  static const Bits kSignBitMask = static_cast<Bits>(1) << (kBitCount - 1);

  // The mask for the fraction bits.
  static const Bits kFractionBitMask =
    ~static_cast<Bits>(0) >> (kExponentBitCount + 1);

  // The mask for the exponent bits.
  static const Bits kExponentBitMask = ~(kSignBitMask | kFractionBitMask);

  // How many ULP's (Units in the Last Place) we want to tolerate when
  // comparing two numbers.  The larger the value, the more error we
  // allow.  A 0 value means that two numbers must be exactly the same
  // to be considered equal.
  //
  // The maximum error of a single floating-point operation is 0.5
  // units in the last place.  On Intel CPU's, all floating-point
  // calculations are done with 80-bit precision, while double has 64
  // bits.  Therefore, 4 should be enough for ordinary use.
  //
  // See the following article for more details on ULP:
  // http://www.cygnus-software.com/papers/comparingfloats/comparingfloats.htm.
  static const size_t kMaxUlps = 4;

  // Constructs a FloatingPoint from a raw floating-point number.
  //
  // On an Intel CPU, passing a non-normalized NAN (Not a Number)
  // around may change its bits, although the new value is guaranteed
  // to be also a NAN.  Therefore, don't expect this constructor to
  // preserve the bits in x when x is a NAN.
  explicit FloatingPoint(const RawType& x) { u_.value_ = x; }

  // Static methods

  // Reinterprets a bit pattern as a floating-point number.
  //
  // This function is needed to test the AlmostEquals() method.
  static RawType ReinterpretBits(const Bits bits) {
    FloatingPoint fp(0);
    fp.u_.bits_ = bits;
    return fp.u_.value_;
  }

  // Returns the floating-point number that represent positive infinity.
  static RawType Infinity() {
    return ReinterpretBits(kExponentBitMask);
  }

  // Non-static methods

  // Returns the bits that represents this number.
  const Bits &bits() const { return u_.bits_; }

  // Returns the exponent bits of this number.
  Bits exponent_bits() const { return kExponentBitMask & u_.bits_; }

  // Returns the fraction bits of this number.
  Bits fraction_bits() const { return kFractionBitMask & u_.bits_; }

  // Returns the sign bit of this number.
  Bits sign_bit() const { return kSignBitMask & u_.bits_; }

  // Returns true iff this is NAN (not a number).
  bool is_nan() const {
    // It's a NAN if the exponent bits are all ones and the fraction
    // bits are not entirely zeros.
    return (exponent_bits() == kExponentBitMask) && (fraction_bits() != 0);
  }

  // Returns true iff this number is at most kMaxUlps ULP's away from
  // rhs.  In particular, this function:
  //
  //   - returns false if either number is (or both are) NAN.
  //   - treats really large numbers as almost equal to infinity.
  //   - thinks +0.0 and -0.0 are 0 DLP's apart.
  bool AlmostEquals(const FloatingPoint& rhs) const {
    // The IEEE standard says that any comparison operation involving
    // a NAN must return false.
    if (is_nan() || rhs.is_nan()) return false;

    return DistanceBetweenSignAndMagnitudeNumbers(u_.bits_, rhs.u_.bits_)
        <= kMaxUlps;
  }

 private:
  // The data type used to store the actual floating-point number.
  union FloatingPointUnion {
    RawType value_;  // The raw floating-point number.
    Bits bits_;      // The bits that represent the number.
  };

  // Converts an integer from the sign-and-magnitude representation to
  // the biased representation.  More precisely, let N be 2 to the
  // power of (kBitCount - 1), an integer x is represented by the
  // unsigned number x + N.
  //
  // For instance,
  //
  //   -N + 1 (the most negative number representable using
  //          sign-and-magnitude) is represented by 1;
  //   0      is represented by N; and
  //   N - 1  (the biggest number representable using
  //          sign-and-magnitude) is represented by 2N - 1.
  //
  // Read http://en.wikipedia.org/wiki/Signed_number_representations
  // for more details on signed number representations.
  static Bits SignAndMagnitudeToBiased(const Bits &sam) {
    if (kSignBitMask & sam) {
      // sam represents a negative number.
      return ~sam + 1;
    } else {
      // sam represents a positive number.
      return kSignBitMask | sam;
    }
  }

  // Given two numbers in the sign-and-magnitude representation,
  // returns the distance between them as an unsigned number.
  static Bits DistanceBetweenSignAndMagnitudeNumbers(const Bits &sam1,
                                                     const Bits &sam2) {
    const Bits biased1 = SignAndMagnitudeToBiased(sam1);
    const Bits biased2 = SignAndMagnitudeToBiased(sam2);
    return (biased1 >= biased2) ? (biased1 - biased2) : (biased2 - biased1);
  }

  FloatingPointUnion u_;
};

EDIT: This post is 4 years old. It's probably still valid, and the code is nice, but some people found improvements. Best go get the latest version of AlmostEquals right from the Google Test source code, and not the one I pasted up here.

Solution 5

For a more in depth approach read Comparing floating point numbers. Here is the code snippet from that link:

// Usable AlmostEqual function    
bool AlmostEqual2sComplement(float A, float B, int maxUlps)    
{    
    // Make sure maxUlps is non-negative and small enough that the    
    // default NAN won't compare as equal to anything.    
    assert(maxUlps > 0 && maxUlps < 4 * 1024 * 1024);    
    int aInt = *(int*)&A;    
    // Make aInt lexicographically ordered as a twos-complement int    
    if (aInt < 0)    
        aInt = 0x80000000 - aInt;    
    // Make bInt lexicographically ordered as a twos-complement int    
    int bInt = *(int*)&B;    
    if (bInt < 0)    
        bInt = 0x80000000 - bInt;    
    int intDiff = abs(aInt - bInt);    
    if (intDiff <= maxUlps)    
        return true;    
    return false;    
}

View more solutions

541,945

Author by

Admin

Updated on July 16, 2022

Comments

Admin almost 2 years
What would be the most efficient way to compare two double or two float values?

Simply doing this is not correct:
```
bool CompareDoubles1 (double A, double B)
{
   return A == B;
}
```
But something like:
```
bool CompareDoubles2 (double A, double B) 
{
   diff = A - B;
   return (diff < EPSILON) && (-diff < EPSILON);
}
```
Seems to waste processing.

Does anyone know a smarter float comparer?
Lazer about 14 years

@OJ: is there something wrong with the first code sample? I thought the only problem was in a situation like this : float a = 3.4; if(a == 3.4){...} i.e when you are comparing a stored floating point with a literal | In this case, both numbers are stored, so they will have the same representation, if equal, so what is the harm in doing a == b?
peterchen almost 14 years

+1 for pointing out the obvious (that often gets ignored). For a generic method, you can make the epsilon relative to fabs(a)+fabs(b) but with compensating for NaN, 0 sum and overflow, this gets quite complex.
peterchen almost 14 years

@Lazer: cygnus-software.com/papers/comparingfloats/comparingfloats.h‌tm.
Sam almost 14 years

You'll want a multiple of that epsilon most likely.
kolistivra over 13 years

Can't you just use std::abs? AFAIK, std::abs is overloaded for doubles as well. Please warn me if I'm wrong.
sehe almost 13 years

+1 for good information. However, I fail to see how you could mess up the equality comparison by increasing the relative error; IMHO the error becomes significant only in the result of the subtraction, however it's order of magnitude relative to that of the two operands being subtracted should still be reliable enough to judge equality. Unless of the resolution needs to be higher overall, but in that case the only solution is to move to a floating point representation with more significant bits in the mantissa.
unj2 almost 13 years

What is the suggested value of maxUlps?
osgx over 12 years

Will "*(int*)&A;" violate strict aliasing rule?
Nemo157 over 12 years

@DonReba: Only if EPSILON is defined as DBL_EPSILON. Normally it will be a specific value chosen depending on the required accuracy of the comparison.
jcoffland over 12 years

@kolistivra, you are wrong. The 'f' in 'fabs' does not mean the type float. You're probably thinking of the C functions fabsf() and fabsl().
AProgrammer over 12 years

Obviously false as it will consider two doubles bigger than 2 to be the same only if they are equals and while considering epsilon/1024 and epsilon to be the same.
Alan Baljeu about 12 years

No. It's copying, not aliasing.
EPonyA almost 12 years

According to gtest (search for ULP), 4 is an acceptable number.
Michael Burr over 11 years

And here are a couple updates to Bruce Dawson's paper (one of which is linked in the paper's intro): randomascii.wordpress.com/2012/02/25/… and randomascii.wordpress.com/2012/06/26/…
bobobobo over 11 years

Actually for reasons outlined in Bruce's article epsilon changes as the floating point value gets bigger. See the part where he says "For numbers larger than 2.0 the gap between floats grows larger and if you compare floats using FLT_EPSILON then you are just doing a more-expensive and less-obvious equality check."
artless noise about 11 years

There must be something I don't understand. The typical float/double is MANTISSA x 2^ EXP. epsilon will be dependent on the exponent. For example if the mantissa is 24bits and the exponent is signed 8bit, then 1/(2^24)*2^127 or ~2^103 is an epsilon for some values; or is this referring to a minimum epsilon?
artless noise about 11 years

+1: I agree this one is correct. However, it doesn't explain why. See here: cygnus-software.com/papers/comparingfloats/comparingfloats.h‌tm I read this blog post after I wrote my comment on the top score here; I believe it says the same thing and provides the rational/solution that is implemented above. Because there is so much code, people will miss the answer.
artless noise about 11 years

Wait a second. Is what I said what you meant? You are saying why |a-b|<epsilon, is not correct. Please add this link to your answer; if you agree cygnus-software.com/papers/comparingfloats/comparingfloats.h‌tm and I can remove my dumb comments.
Khaled.K almost 11 years

+1 Which arise the need for using a special error margin based on scenario and what are you comparing
mholzmann almost 11 years

i know this is old but std::abs is overloaded for floating point types in cmath.
mctylr over 10 years

Actually a very solid reference to a good answer, though it is very specialized to limit anyone without scientific computing or numerical analysis background (I.e. LAPACK, BLAS) to not understand the completeness. Or in other words, it assumes you've read something like Numerical Recipes introduction or Numerical Analysis by Burden & Faires.
Pascal Cuoq over 10 years

“EPSILON is the smallest value that could be added to 1.0 and change its value”: Actually, this honor goes to the successor of 0.5*EPSILON (in the default round-to-nearest mode). blog.frama-c.com/index.php?post/2013/05/09/FLT_EPSILON
kevintodisco over 10 years

EPSILON comparison does not work when the floats are large, as the difference between consecutive floats also becomes large. See this article.
Maxim Egorushkin about 10 years

No wonder there is Z-fighting in some games when textures/objects far away flicker, like in Battlefield 4. Comparing the difference with EPSILON is pretty much useless. You need to compare with a threshold that makes sense for the units at hand. Also, use std::abs since it is overloaded for different floating point types.
JeffCharter about 10 years

It took me a awhile to figure out what on ULP was: Units in the Last Place
JeffCharter about 10 years

There are a couple of nasty things that can happen when implicit casts occur doing say FloatPoint<double> fp(0.03f). I made a couple modifications to this to help prevent that. template<typename U> explicit FloatingPoint(const U& x) { if(typeid(U).name() != typeid(RawType).name()) { std::cerr << "You're doing an implicit conversion with FloatingPoint, Don't" << std::endl; assert(typeid(U).name() == typeid(RawType).name()); } u_.value_ = x; }
skrebbel about 10 years

Good find! I guess it would be best to contribute them to Google Test, though, where this code was stolen from. I'll update the post to reflect that probably there's a newer version. If the Google guys act itchy, could you put it in e.g. a GitHub gist? I'll link to that as well, then.
shelleybutterfly almost 10 years

+1: hey, I'm not going to do game programming with this, but the idea of round-tripping floats came up several times in Bruce Dawson's blog (treatise? :D) on the issue, and if you're trapped in a room and someone puts a gun to your head and says "hey you have to compare two floats to within X significant figures, you have 5 minutes, GO!" this is probably one to consider. ;)
Tommy Andersen over 9 years

@shelleybutterfly Then again the question was for the most efficient way of comparing two floating point numbers.
shelleybutterfly over 9 years

@TommyA lol perhaps, but I bet round-tripping was downvoted for reasons not efficiency-related. Though my intuition is that it would be rather inefficient compared to HW fp math but also says that the algoritihms in software fp are unlikely to have a bigO difference at least. I eagerly await the analysis you did showing the efficiency concerns in that case are significant. Besides, sometimes less-than-optimal can still be a valuable answer, and as it was downvoted–despite being a valid technique that was even mentioned by Dawson's blog on the subject, so I thought it deserved an upvote.
Gaston over 9 years

Am I the only one getting 'false' by comparing the doubles 0 with 1e-16? 0's biased representation is 9223372036854775808 while 1e-16's biased representation is 13590969439990876604. It seems to be a discontinuity in the representation or am I doing something wrong?
Ben Voigt about 9 years

Why do you think that EPSILON in the question is DBL_EPSILON or FLT_EPSILON? The problem is in your own imagination, where you substituted DBL_EPSILON (which indeed would be the wrong choice) into code that did not use it.
Don Reba about 9 years

@BenVoigt, you're right, it was something on my mind at the time, and I interpreted the question in that light.
user2261015 over 8 years

The article mentioned above gets the point: cygnus-software.com/papers/comparingfloats/comparingfloats.h‌tm -> AlmostEqualRelativeOrAbsolute(...) fits best. When using floating point you deal with relative errors, otherwise you wouldn't need floating point. The problem is, that most people use floating point because they are kind of lazy. It seems to be so easy. But its not. Most people will be better off with fixed point, but this is so exotic :)
Paulo Carvalho about 8 years

If anyone is interested, I made a version for doubles of @user2261015 's solution:bool almostEqual2sComplement(double A, double B, int maxUlps) { assert(maxUlps > 0 && maxUlps < 4 * 1024 * 1024); int64_t aLong = *reinterpret_cast<int64_t*>( &A ); if (aLong < 0) aLong = 0x8000000000000000 - aLong; int64_t bLong = *reinterpret_cast<int64_t*>( &B ); if (bLong < 0) bLong = 0x8000000000000000 - bLong; int64_t longDiff = (aLong - bLong) & 0x7FFFFFFFFFFFFFFF; if (longDiff <= maxUlps) return true; return false; }
Josh Milthorpe almost 8 years

@kolistivra is not wrong, although it is probably better to use std::fabs to avoid accidentally conversion to int. stackoverflow.com/questions/3118165/…
Merlyn Morgan-Graham almost 8 years

This is a very long comment, not an answer in itself. Is there a (set of) canonical answer(s) for all contexts?
SirGuy over 7 years

This method has many weaknesses, like if the numbers a and b are already smaller than epsilon() there difference may still be significant. Conversely if the numbers are very large then even a couple of bits of error will make the comparison fail even if you did want the numbers to be considered equal. This answer is exactly the type of "generic" comparison algorithm that you want to avoid.
rbaleksandar over 7 years

fabs(a - b) <= ( (fabs(a) < fabs(b) ? fabs(b) : fabs(a)) * epsilon); saved my live. LOL Note that this version (I haven't checked if is applies for the others too) also considers the change that might occur in the integral part of the floating point number (example: 2147352577.9999997616 == 2147352576.0000000000 where you can clearly see that there is almost a difference of 2 between the two numbers) which is quite nice! This happens when the accumulated rounding error overflows the decimal part of the number.
Marson Mao over 7 years

The old link seems to be obsolete, new page is here randomascii.wordpress.com/2012/02/25/…
Marson Mao over 7 years

The old link seems to be obsolete, new page is here randomascii.wordpress.com/2012/02/25/…
Bhargav Rao over 7 years

Please don't add the same answer to multiple questions. Answer the best one and flag the rest as duplicates. See meta.stackexchange.com/questions/104227/…
greybeard over 7 years

I do not think "epsilon comparison" possible using just ExOr (and one or two comparisons), even restricted to normalised representations in the same format.
Jaege over 7 years

For the newest code snippet, see here and here.
johnbakers about 7 years

there is a nice example how to do this with the standard library here: en.cppreference.com/w/cpp/types/numeric_limits/epsilon
user2261015 about 7 years

I downvoted since the example code shows the typical bug whis is repeated by the majority of programmers. Floating point is always about relative errors, since it is floating point (not fixed point). So it will never correctly work with a fixed error (epsilon).
miguel.martin over 6 years

What's the point of posting this solution if you don't explain how it works? Pointless answer, I don't see why this is upvoted.
Sneftel almost 6 years

That's not what epsilon is for.
Sneftel almost 6 years

Subtracting two nearly-equal numbers does NOT lead to catastrophic cancellation -- in fact, it does not introduce any error at all (q.v. Sterbenz's Theorem). Catastrophic cancellation occurs earlier, during the calculation of a and b themselves. There is absolutely no problem with using floating point subtraction as part of a fuzzy comparison (though as others have said, an absolute epsilon value may or may not be appropriate for a given use case.)
Steve Hollasch almost 6 years

No. If a or b are greater than or equal to one, then this is the same as using the equality operator. If a and b are less than one, then the relative size of epsilon will increase as the values of a and b get smaller. From en.cppreference.com/w/cpp/types/numeric_limits/epsilon, std::numeric_limits<T>::epsilon() “returns the machine epsilon, that is, the difference between 1.0 and the next value representable by the floating-point type T.”
Steve Hollasch almost 6 years

@JeffCharter or Units of Least Precision.
BobMorane over 5 years

Very nice and helpful article by Bruce Dawson, thanks!
debuti over 5 years

Why not? Can you explain it?
Razakhel over 5 years

Given that this question is tagged C++, your checks would be easier to read being written as std::max(std::abs(a), std::abs(b)) (or with std::min()); std::abs in C++ is overloaded with float & double types, so it works just fine (you can always keep fabs for readability though).
mwpowellhtx over 5 years

definitelyGreaterThan is reporting true for something that should definitely be equal to, i.e. not greater than.
mwpowellhtx over 5 years

Turns out the problem was in my code, difference between the original expected value and the parsed string.
Ben Voigt almost 5 years

@osgx: Yes, that violates strict aliasing. And the correct code is so simple: memcpy(&aInt, &A, sizeof (int)); Throw in static_assert(sizeof(int) == sizeof(float)); for good measure.
Sneftel almost 5 years

@debuti epsilon is merely the distance between 1 and the next representable number after 1. At best, that code is just trying to check whether the two numbers are exactly equal to each other, but because non-powers of 2 are being multiplied by epsilon, it isn't even doing that correctly.
Sneftel almost 5 years

Oh, and std::fabs(std::min(v1, v2)) is incorrect -- for negative inputs it picks the one with the larger magnitude.
Jaap Versteegh over 4 years

I believe return *(reinterpret_cast<double*>(&x)); although it usually works, is in fact undefined behaviour.
Steve Hollasch over 4 years

Fair point, though this code is illustrative, so sufficient to demonstrate the issue for numeric_limits<>::epsilon and IEEE 754 flooring point.
Jaap Versteegh over 4 years

Also a fair point, but it's not wise imho to post on stack overflow expecting that kind of insight. The code will be blindly copied making it ever harder to eradicate this very common pattern --together with the union trick-- which should just be avoided as all UD should.
qwr over 4 years

This does not answer the question.
qwr about 4 years

It will work correctly IF that is what your situation requires.
Matt Chambers about 4 years

isDefinitelyLessThan checks diff < tolerance, which means a and b are almost equal (and so a is not definitely less than b). Doesn't it make more sense to check diff > tolerance in both cases? Or perhaps add an orEqualTo argument which controls whether the approximate equality check should return true or not.
stijn about 4 years

This is exactly the same as many of the other answers except that it's a lambda and has no explanation, so this doesn't add much value as an answer.
derke almost 4 years

@Mehdi I've just tried with repl.it/repls/SvelteSimpleNumerator#main.cpp and it seems to behave as expected - but perhaps you have a particular compiler implementation to reference that doesn't do this?
Stefan Fabian almost 4 years

This is the same as A==B just with more instructions and does not address the problem of that comparison often being false simply due to numerical errors which is why you usually use something like std::abs(A-B) < epsilon. It also introduces the weird behavior that any valid or invalid float you compare with a NaN value will return true.
Shital Shah almost 4 years

For less than and greater than relationships, we need to use < and >.
algae over 3 years

@SirGuy How is this any different from the answer 3 posts down with 100+ upvotes?
SirGuy over 3 years

@algae if you're asking why I put this comment on this answer and not the same one with 100+ upvotes then I hadn't noticed is why.
SirGuy over 3 years

This method has many weaknesses, like if the numbers a and b are already smaller than EPSILON their difference may still be significant. Conversely if the numbers are very large then even a couple of bits of error will make the comparison fail even if you did want the numbers to be considered equal. You could make EPSILON a parameter to the function and then pass off the responsibility of choosing the value to the caller, but I think in practice people are not going think very hard about what they should pass in and so the problem will persist.
John Thoits over 3 years

Actually, if a==b and b==c then a==c is a true statement. It's when you start trying to use algebra and assume that precision errors won't show up that you run into trouble.
Gabriel Staples over 3 years

@SirGuy, please post an answer to demonstrate how to properly do this then, or link to one here. I'd like to see non-epsilon-based alternatives.
Gabriel Staples over 3 years

Continuing down the epsilon approach, however, one can get rid of fabs() entirely by using subtraction twice and && as I show in my new answer here: stackoverflow.com/a/65015333/4561887.
Gabriel Staples over 3 years

@SirGuy, I just added an approach in my answer which calculates a variable epsilon based on the floating point values being compared. I wonder if this solves the problems?
Gabriel Staples over 3 years

The question has been edited to be correct now. Both return (diff < EPSILON) && (diff > -EPSILON); and return (diff < EPSILON) && (-diff < EPSILON); are equivalent and both correct.
yyny over 3 years

@Ben Violating strict aliasing doesn't matter in this case since the pointer can never be aliased anyway (because the pointer is immediately dereferenced). This will only matter in a multithreaded environment, and if that creates a bug for you, then you have written some truly terrible code. In all other cases memcpy is a waste of processor and programmer overhead.
Sneftel about 3 years

For non-subnormal numbers in IEEE-754 your rewritten code is equivalent to checking for equality. An IEEE float is not capable of storing any numbers between 1-1e-8 and 1+1e-8 except for 1, so you hit catastrophic cancellation in any circumstances where a tolerance would come into play.
Oleksandr Boiko about 3 years

Thank you! I shall increase fraction value in example
Firstrock over 2 years

Are you (or Dr. Knuth) making a deliberate choice to use float arguments but calling double fabs()?