Trying to write a code for finding the machine epsilon

12,213

Solution 1

A guess why you're getting the same answer:

if( (1.0 + test) != 1.0)

Here 1.0 is a double constant, so it's promoting your float to a double and performing the addition as a double. You probably want to declare a temporary float here to perform the addition, or make those float numeric constants (1.0f IIRC).

You might also be falling into the excess-precision-in-temporary-floats problem and might need to force it to store the intermediates in memory to reduce to the correct precision.


Here's a quick go at redoing your range search method but computing the test in the correct type. I get an answer that's slightly too large, though.

#include <stdio.h>
#define N 100000
#define TYPE float

int main(void)
{
   TYPE max = 1.0, min = 0.0, test;
   int i;

   for (i = 0; i < N; i++)
   {
      TYPE one_plus_test;

      test = (max + min) / ((TYPE)2.0);
      one_plus_test = ((TYPE)1.0) + test;
      if (one_plus_test == ((TYPE)1.0))
      {
         min = test;
      }
      else
      {
         max = test;
      }
   }
   printf("The epsilon machine is %.50lf\n", max);
   return 0;
}

Solution 2

It depends upon what you mean by "precision level".

Floating-point numbers have "regular" (normal) values, but there are special, sub-normal numbers as well. If you want to find out different limits, the C standard has predefined constants:

#include <math.h>
#include <stdio.h>
#include <float.h>

int main(void)
{
    printf("%30s: %g\n", "FLT_EPSILON", FLT_EPSILON);
    printf("%30s: %g\n", "FLT_MIN", FLT_MIN);
    printf("%30s: %g\n", "nextafterf(0.0, 1.0)", nextafterf(0.0, 1.0));
    printf("%30s: %g\n", "nextafterf(1.0, 2.0)-1", (nextafterf(1.0, 2.0) - 1.0f));
    puts("");
    printf("%30s: %g\n", "DBL_EPSILON", DBL_EPSILON);
    printf("%30s: %g\n", "DBL_MIN", DBL_MIN);
    printf("%30s: %g\n", "nextafter(0.0, 1.0)", nextafter(0.0, 1.0));
    printf("%30s: %g\n", "nextafter(1.0, 2.0)-1", (nextafter(1.0, 2.0) - 1.0));
    puts("");
    printf("%30s: %Lg\n", "LDBL_EPSILON", LDBL_EPSILON);
    printf("%30s: %Lg\n", "LDBL_MIN", LDBL_MIN);
    printf("%30s: %Lg\n", "nextafterl(0.0, 1.0)", nextafterl(0.0, 1.0));
    printf("%30s: %Lg\n", "nextafterl(1.0, 2.0)-1", (nextafterl(1.0, 2.0) - 1.0));
    return 0;
}

The above program prints 4 values for each type:

  • the difference between 1 and the least value greater than 1 in that type (TYPE_EPSILON),
  • the minimum positive normalized value in a given type (TYPE_MIN). This does not include subnormal numbers,
  • the minimum positive value in a given type (nextafter*(0...)). This includes subnormal numbers,
  • the minimum number greater than 1. This is the same as TYPE_EPSILON, but calculated in a different way.

Depending upon what you mean by "precision", any or none of the above can be useful to you.

Here is the output of the above program on my computer:

               FLT_EPSILON: 1.19209e-07
                   FLT_MIN: 1.17549e-38
      nextafterf(0.0, 1.0): 1.4013e-45
    nextafterf(1.0, 2.0)-1: 1.19209e-07

               DBL_EPSILON: 2.22045e-16
                   DBL_MIN: 2.22507e-308
       nextafter(0.0, 1.0): 4.94066e-324
     nextafter(1.0, 2.0)-1: 2.22045e-16

              LDBL_EPSILON: 1.0842e-19
                  LDBL_MIN: 3.3621e-4932
      nextafterl(0.0, 1.0): 3.6452e-4951
    nextafterl(1.0, 2.0)-1: 1.0842e-19

Solution 3

IEEE 754 floating-point formats have the property that, when reinterpreted as a two's complement integer of the same width, they monotonically increase over positive values and monotonically decrease over negative values (see the binary representation of 32 bit floats). They also have the property that 0 < |f(x)| < ∞, and |f(x+1) − f(x)| ≥ |f(x) − f(x−1)| (where f(x) is the aforementioned integer reinterpretation of x). In languages that allow type punning and always use IEEE 754-1985, we can exploit this to compute a machine epsilon in constant time. For example, in C:

typedef union {
  long long i64;
  double d64;
} dbl_64;

double machine_eps (double value)
{
    dbl_64 s;
    s.d64 = value;
    s.i64++;
    return s.d64 - value;
}

From https://en.wikipedia.org/wiki/Machine_epsilon

Solution 4

I'm not sure how your algorithm is supposed to work. This one (C++) gives the correct answers:

#include <iostream>

template<typename T>
int epsilon() {
    int pow = 0;
    T eps = 1;
    while (eps + 1 != 1) {
        eps /= 2;
        --pow;
    }
    return pow + 1;
}

int main() {
    std::cout << "Epsilon for float: 2^" << epsilon<float>() << '\n';
    std::cout << "Epsilon for double: 2^" << epsilon<double>() << '\n';
}

This computes the smallest value such that, when added to 1, is still distinguishable from 1.

Output:

Epsilon for float: 2^-23
Epsilon for double: 2^-52

Solution 5

I would like to add that you can get the highest precision from a floating point calculation by using long double.

To apply that to @Rup's solution, just change the TYPE to long double and the printf statement to:

printf("The epsilon machine is %.50Lf\n", max);

This is the Epsilon on my machine using float:

0.00000005960465188081798260100185871124267578125000

And using long double:

0.00000000000000000005421010862427522170625011179761

The difference is quite significant.

Share:
12,213
JMzance
Author by

JMzance

Updated on June 08, 2022

Comments

  • JMzance
    JMzance almost 2 years

    I am trying to find out the precision level for various floating point formats in C (i.e. float, double and long double). Here is the code I'm using at the moment:

    #include <stdio.h>
    #define N 100000
    
    int main(void)
    {
       float max = 1.0, min = 0.0, test;
       int i;                              /* Counter for the conditional loop */
    
       for (i = 0; i < N; i++) {
          test = (max + min) / 2.0;
          if( (1.0 + test) != 1.0)         /* If too high, set max to test and try again */
         max = test;
      if( (1.0 + test) == 1.0)     /* If too low, set min to test and try again */
             min = test;
       }
       printf("The epsilon machine is %.50lf\n", max);
       return 0;
    }
    

    This gives the value of roughly ~2^-64 as expected. However when I change the decelerations to doubles or 'long doubles' I get the same answer I should get a smaller value but I don't. Anybody got any ideas?