Fast transcendental / trigonometric functions for Java

Solution 1

Computer Approximations by Hart tabulates Chebyshev-economized approximate formulas for a wide range of functions at various precisions.

Edit: When I got my copy off the shelf, it turned out to be a different book that just sounds very similar. Here's a sin function using its tables. (Tested in C since that's handier for me.) I don't know if this will be faster than the Java built-in, but it's guaranteed to be less accurate, at least. :) You may need to range-reduce the argument first; see John Cook's suggestions. The book also has arcsin and arctan.

#include <math.h>
#include <stdio.h>

// Return an approx to sin(pi/2 * x) where -1 <= x <= 1.
// In that range it has a max absolute error of 5e-9
// according to Hastings, Approximations For Digital Computers.
static double xsin (double x) {
  double x2 = x * x;
  return ((((.00015148419 * x2
             - .00467376557) * x2
            + .07968967928) * x2
           - .64596371106) * x2
          + 1.57079631847) * x;
}

int main () {
  double pi = 4 * atan (1);
  printf ("%.10f\n", xsin (0.77));
  printf ("%.10f\n", sin (0.77 * (pi/2)));
  return 0;
}
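
For reference, here is a direct Java port of the same polynomial. This is only a sketch: the class name is made up for the example, the coefficients are the Hastings ones from the C code above, and range-reducing the argument into [-1, 1] is still left to the caller.

public final class FastSin {
  /** Approximates sin(pi/2 * x) for -1 <= x <= 1 (Hastings coefficients, ~5e-9 max error). */
  static double xsin(double x) {
    double x2 = x * x;
    return ((((.00015148419 * x2
               - .00467376557) * x2
              + .07968967928) * x2
             - .64596371106) * x2
            + 1.57079631847) * x;
  }

  public static void main(String[] args) {
    // Same check as the C program: approximation vs. java.lang.Math.sin
    System.out.printf("%.10f%n", xsin(0.77));
    System.out.printf("%.10f%n", Math.sin(0.77 * (Math.PI / 2)));
  }
}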

Solution 2

Here is a collection of low-level tricks for quickly approximating trig functions. There is example code in C which I find hard to follow, but the techniques are just as easily implemented in Java.

Here's my equivalent implementation of invsqrt and atan2 in Java.

I could have done something similar for the other trig functions, but I have not found it necessary as profiling showed that only sqrt and atan/atan2 were major bottlenecks.

public class FastTrig
{
  /** Fast approximation of 1.0 / sqrt(x).
   * See <a href="http://www.beyond3d.com/content/articles/8/">http://www.beyond3d.com/content/articles/8/</a>
   * @param x Positive value to estimate inverse of square root of
   * @return Approximately 1.0 / sqrt(x)
   **/
  public static double
  invSqrt(double x)
  {
    double xhalf = 0.5 * x; 
    long i = Double.doubleToRawLongBits(x);
    i = 0x5FE6EB50C7B537AAL - (i>>1); 
    x = Double.longBitsToDouble(i);
    x = x * (1.5 - xhalf*x*x); 
    return x; 
  }

  /** Approximation of arctangent.
   *  Slightly faster and substantially less accurate than
   *  {@link Math#atan2(double, double)}.
   **/
  public static double fast_atan2(double y, double x)
  {
    double d2 = x*x + y*y;

    // Bail out if d2 is NaN, zero or subnormal
    if (Double.isNaN(d2) ||
        (Double.doubleToRawLongBits(d2) < 0x10000000000000L))
    {
      return Double.NaN;
    }

    // Normalise such that 0.0 <= y <= x
    boolean negY = y < 0.0;
    if (negY) {y = -y;}
    boolean negX = x < 0.0;
    if (negX) {x = -x;}
    boolean steep = y > x;
    if (steep)
    {
      double t = x;
      x = y;
      y = t;
    }

    // Scale to unit circle (0.0 <= y <= x <= 1.0)
    double rinv = invSqrt(d2); // rinv ≅ 1.0 / hypot(x, y)
    x *= rinv; // x ≅ cos θ
    y *= rinv; // y ≅ sin θ, hence θ ≅ asin y

    // Hack: we want: ind = floor(y * 256)
    // We deliberately force truncation by adding floating-point numbers whose
    // exponents differ greatly.  The FPU will right-shift y to match exponents,
    // dropping all but the first 9 significant bits, which become the 9 LSBs
    // of the resulting mantissa.
    // Inspired by a similar piece of C code at
    // http://www.shellandslate.com/computermath101.html
    double yp = FRAC_BIAS + y;
    int ind = (int) Double.doubleToRawLongBits(yp);

    // Find φ (a first approximation of θ) from the LUT
    double φ = ASIN_TAB[ind];
    double cφ = COS_TAB[ind]; // cos(φ)

    // sin(φ) == ind / 256.0
    // Note that sφ is truncated, hence not identical to y.
    double sφ = yp - FRAC_BIAS;
    double sd = y * cφ - x * sφ; // sin(θ-φ) ≡ sinθ cosφ - cosθ sinφ

    // asin(sd) ≅ sd + ⅙sd³ (from first 2 terms of Maclaurin series)
    double d = (6.0 + sd * sd) * sd * ONE_SIXTH;
    double θ = φ + d;

    // Translate back to correct octant
    if (steep) { θ = Math.PI * 0.5 - θ; }
    if (negX) { θ = Math.PI - θ; }
    if (negY) { θ = -θ; }

    return θ;
  }

  private static final double ONE_SIXTH = 1.0 / 6.0;
  private static final int FRAC_EXP = 8; // LUT precision == 2 ** -8 == 1/256
  private static final int LUT_SIZE = (1 << FRAC_EXP) + 1;
  private static final double FRAC_BIAS =
    Double.longBitsToDouble((0x433L - FRAC_EXP) << 52);
  private static final double[] ASIN_TAB = new double[LUT_SIZE];
  private static final double[] COS_TAB = new double[LUT_SIZE];

  static
  {
    /* Populate trig tables */
    for (int ind = 0; ind < LUT_SIZE; ++ ind)
    {
      double v = ind / (double) (1 << FRAC_EXP);
      double asinv = Math.asin(v);
      COS_TAB[ind] = Math.cos(asinv);
      ASIN_TAB[ind] = asinv;
    }
  }
}
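
A quick usage sketch comparing the approximations above against java.lang.Math (the demo class name is made up for the example):

public class FastTrigDemo {
  public static void main(String[] args) {
    double y = 3.0, x = 4.0;
    System.out.println("fast_atan2: " + FastTrig.fast_atan2(y, x));
    System.out.println("Math.atan2: " + Math.atan2(y, x));
    System.out.println("invSqrt:    " + FastTrig.invSqrt(x));
    System.out.println("1/sqrt(x):  " + (1.0 / Math.sqrt(x)));
  }
}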

Solution 3

This might do it: http://sourceforge.net/projects/jafama/ (Jafama, a fast math library for Java).
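
A minimal usage sketch, assuming Jafama's FastMath class mirrors the java.lang.Math API as the project describes (the package, class and method names here are an assumption based on the project's documentation, not something confirmed in this thread):

// Sketch only: assumes net.jafama.FastMath provides drop-in replacements for java.lang.Math.
import net.jafama.FastMath;

public class JafamaDemo {
  public static void main(String[] args) {
    double a = 0.77;
    System.out.println("FastMath.sin: " + FastMath.sin(a));
    System.out.println("Math.sin:     " + Math.sin(a));
  }
}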

Solution 4

I'm surprised that the built-in Java functions would be so slow. Surely the JVM is calling the native trig functions on your CPU, not implementing the algorithms in Java. Are you certain your bottleneck is calls to trig functions and not some surrounding code? Maybe some memory allocations?

Could you rewrite the math-heavy part of your code in C++? Just calling into C++ to compute trig functions probably wouldn't help because of the call overhead, but moving some surrounding context, such as an outer loop, into C++ as well might speed things up.

If you must roll your own trig functions, don't use Taylor series alone. The CORDIC algorithms are much faster unless your argument is very small. You could use CORDIC to get started, then polish the result with a short Taylor series. See this StackOverflow question on how to implement trig functions.
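
For illustration only, here is a minimal floating-point CORDIC sketch in rotation mode. The class and method names are made up for the example, and it is not tuned for speed: a production CORDIC would use fixed-point integers and bit shifts instead of doubles and Math.pow. It only shows how the iteration converges on sin and cos.

public final class CordicSketch {
  private static final int ITERATIONS = 24;
  private static final double[] ANGLES = new double[ITERATIONS];
  private static final double GAIN; // compensates the cumulative pseudo-rotation gain

  static {
    double k = 1.0;
    for (int i = 0; i < ITERATIONS; i++) {
      ANGLES[i] = Math.atan(Math.pow(2, -i));
      k *= 1.0 / Math.sqrt(1.0 + Math.pow(2, -2.0 * i));
    }
    GAIN = k;
  }

  /** Returns {cos(a), sin(a)} for a roughly in [-pi/2, pi/2]. */
  public static double[] cosSin(double a) {
    double x = GAIN, y = 0.0, z = a;
    for (int i = 0; i < ITERATIONS; i++) {
      double d = (z >= 0.0) ? 1.0 : -1.0; // rotate towards the remaining angle
      double p = Math.pow(2, -i);         // would be a right shift in fixed point
      double nx = x - d * y * p;
      double ny = y + d * x * p;
      x = nx;
      y = ny;
      z -= d * ANGLES[i];
    }
    return new double[] { x, y };         // x ~ cos(a), y ~ sin(a)
  }

  public static void main(String[] args) {
    double a = 0.6;
    double[] cs = cosSin(a);
    System.out.println(cs[1] + " vs " + Math.sin(a));
    System.out.println(cs[0] + " vs " + Math.cos(a));
  }
}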

Solution 5

You could precompute your sin and cos values and store them in an array if you only need approximate values. For example, if you want to store one value per degree from 0° to 359°:

double[] sin = new double[360];
for (int i = 0; i < sin.length; ++i) {
  sin[i] = Math.sin(i / 180.0 * Math.PI);
}

You then index this array with degrees (integers) instead of radians (doubles).
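
A usage sketch for the table above (the helper name is made up, and it assumes it sits next to the table); it wraps the angle into [0, 360) and indexes by whole degrees, so accuracy is limited to the 1° granularity of the table:

// Sketch: look up an approximate sine by whole degrees from the precomputed table.
static double fastSinDeg(double[] sinTable, int degrees) {
  int d = ((degrees % 360) + 360) % 360; // wrap negative and large angles into [0, 360)
  return sinTable[d];
}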

Author: Hans-Peter Störr

I like to develop elegant and concise solutions for complex problems. Mostly I program in Java, but I do appreciate Scala very much, since it allows for many programming styles that help with that. I prefer wisely commented code, much unit testing and functional programming with immutable objects. And I do like maths very much.

Updated on June 06, 2022

Comments

  • Hans-Peter Störr (about 2 years ago)

    Since the trigonometric functions in java.lang.Math are quite slow: is there a library that does a quick and good approximation? It seems possible to do a calculation several times faster without losing much precision. (On my machine a multiplication takes 1.5ns, and java.lang.Math.sin 46ns to 116ns). Unfortunately there is not yet a way to use the hardware functions.

    UPDATE: The functions should be accurate enough, say, for GPS calculations. That means you would need at least 7 decimal digits accuracy, which rules out simple lookup tables. And it should be much faster than java.lang.Math.sin on your basic x86 system. Otherwise there would be no point in it.

    For values over pi/4, Java does some expensive computations in addition to the hardware functions. It does so for a good reason, but sometimes you care more about speed than about last-bit accuracy.

  • Hans-Peter Störr (over 15 years ago)
    commons math is a very good tip, but I did not find any faster replacement for Math.sin, for example. Is there?
  • Hans-Peter Störr (over 15 years ago)
    The Taylor series is probably not faster if you want reasonable accuracy. You have to do something more clever, like using piecewise polynomials.
  • Hans-Peter Störr (over 15 years ago)
    As far as I know they do not use the hardware functions. The Javadoc of Math.sin says the result must be accurate up to the next to last bit, which hardware implementations might not meet. So it is in software.
  • Hans-Peter Störr (over 15 years ago)
    Yes, but this is very inaccurate. I was thinking of something better, like polynomial interpolation.
  • mark (over 15 years ago)
    Reminds me of the good old pre-calculation days, like Doom... Anyway, you can increase the accuracy by generating not just 360 values but e.g. 0xffff values.
  • wds (over 15 years ago)
    Using JNI for a single Math.sin call probably won't work because of the overhead. Perhaps if you put more of your program in C, but then you could've written it in C to start with.
  • John D. Cook (over 15 years ago)
    Use CORDIC to reduce the arguments before applying Taylor series. See stackoverflow.com/questions/345085/…
  • PhiLho (over 15 years ago)
    @hstoerr: why inaccurate? It is as precise as the length of the array (ie. the granularity of the angle). That's the good old tradeoff between speed and memory, and performance is optimal here.
  • Hans-Peter Störr (over 15 years ago)
    If you want to have 7 decimal digits accuracy, as you would need for GPS calculations, you would need 10000000 values. You probably don't want to precalculate that much, do you?
  • Hans-Peter Störr (over 15 years ago)
    I tried on my system - 2ns for a multiplication, 46ns for Math.sin. That can't be hardware - sin isn't THAT much slower.
  • Jason S (over 15 years ago)
    NEVER use Taylor series to approximate functions. see my comment stackoverflow.com/questions/345085/…
  • kquinn (over 15 years ago)
    Yeah, it can. On the x87 FPU, multiplies are around 4 cycles, and sine is in the range of 100. So that result is entirely consistent with a 2GHz processor evaluating them in hardware.
  • Hans-Peter Störr (over 15 years ago)
    OK, I'll have to check the same thing in C++ or something. Still: the time to calculate depends on the argument. If you calculate the sin of 0.1 it takes 46ns; if you calculate the sin of 6.28 it's 115ns. That's not the hardware, is it?
  • David Schmitt (over 14 years ago)
    hstoerr: the bug cited by Bruce ONeel has details on why bigger arguments lead to longer calculation times. Basically, Intel's sin/cos implementation sucks golfballs through gardenhoses for arguments outside of [-pi/4, pi/4], and the JVM has to manually map the argument into this range.
  • dotancohen (almost 12 years ago)
    I'm not going to downvote due to the variable names, but would you like to maintain code with identifiers such as משתנה and פונקציה?
  • Saurabh (almost 12 years ago)
    @dotancohen: Is that how my variable names render for you? I posted them in UTF-8. Sounds like your browser is guessing the encoding incorrectly (CP1255?)
  • dotancohen (almost 12 years ago)
    No, my browser properly renders the Greek. However, if Greek identifiers are fair game, then why not Hebrew, or even Korean? I was trying to illustrate that although our tools can be abused, we really shouldn't abuse them, for the sake of those who come after us. Even code that is meant for "internal use only" might be inherited by your son or an undergrad! And yes, my three year old said "public static void main" today!
  • Saurabh (almost 12 years ago)
    @dotancohen, because I am using those Greek letters only for well-established meanings (i.e. known to most English speakers with a bit of math knowledge.)
  • dotancohen (almost 12 years ago)
    Yes, but how would you like the maintainer to type them? Or when he updates Java N to Java N+Z where N+Z needs foobar() replaced with doodad() and the Perl script that does the conversion doesn't support codepoints > 127. Yes, this happened in PHP 3 -> 4 and also Python 2 -> 3. Lots of legacy Java code won't run in the latest JVM and a Perl script could fix and recompile those. That is just one example of many why I don't recommend using non-ASCII identifiers.
  • Saurabh (almost 12 years ago)
    @dotancohen, I would blame the Perl script (not my source) for that. Java has always allowed all unicode characters (and any alphanumeric unicode character in identifiers.) Any tool that cannot deal with that is broken. It's not difficult to implement either. If the author hasn't considered non-ASCII characters, there's a good chance they have been complacent with other aspects of the script, and I would avoid it anyway.
  • Saurabh (almost 12 years ago)
    If there's a good reason to avoid non-ASCII characters, it's editors. Only a handful can handle the necessary font-mixing (where a character is not in the user's preferred font but another installed font has it.)
  • dotancohen (almost 12 years ago)
    You are right of course, concerning both the scripts and the editors. But robust programming is in identifying where the potential pitfalls will be and taking the appropriate measures. The problem is in the script, no doubt, but why code a program that won't parse in a broken script when you could just as easily code a program that will?
  • Ozzy (about 11 years ago)
    Looks like some form of Taylor series approximation. See the general formulas here: en.wikipedia.org/wiki/…
  • DisplayName (over 10 years ago)
    loving the unicode variable names! They actually make the math code much clearer!
  • mljrg (over 7 years ago)
    @Bjarke yah, this one helps a lot θ = -θ ;-)