Fast sine and cosine function in java

11,084

Solution 1

Since you don't care much about accuracy store it in a table that is precomputed or only computed once, this is what I do when I want to avoid calls to Math which can be expensive when done alot.

Roughly

public class CosSineTable {
double[] cos = new double[361];
double[] sin = new double[361];
private static CosSineTable table = new CosSineTable();

private CosSineTable() {
    for (int i = 0; i <= 360; i++) {
        cos[i] = Math.cos(Math.toRadians(i));
        sin[i] = Math.sin(Math.toRadians(i));
    }
}

public double getSine(int angle) {
    int angleCircle = angle % 360;
    return sin[angleCircle];
}

public double getCos(int angle) {
    int angleCircle = angle % 360;
    return cos[angleCircle];
}

public static CosSineTable getTable() {
    return table;
}
}

I leave the optimization of the loop and methods to you.

Solution 2

A pre-calculated table's the way to go. Here's an implementation:

static final int precision = 100; // gradations per degree, adjust to suit

static final int modulus = 360*precision;
static final float[] sin = new float[modulus]; // lookup table
static { 
    // a static initializer fills the table
    // in this implementation, units are in degrees
    for (int i = 0; i<sin.length; i++) {
        sin[i]=(float)Math.sin((i*Math.PI)/(precision*180));
    }
}
// Private function for table lookup
private static float sinLookup(int a) {
    return a>=0 ? sin[a%(modulus)] : -sin[-a%(modulus)];
}

// These are your working functions:
public static float sin(float a) {
    return sinLookup((int)(a * precision + 0.5f));
}
public static float cos(float a) {
    return sinLookup((int)((a+90f) * precision + 0.5f));
}

On my laptop, these were about 6x faster than Math.sin.

I only used one table -- the cost of shifting a cosine into a sine wasn't really discernible.

I used floats, assuming that's what you'll likely use in your calculations, given your preference for performance over precision. It doesn't make much difference here, since the bottleneck is really just the array lookup.

Here are my benchmarks:

public static void main(String[] args) {
    int reps = 1<<23;
    int sets = 4;

    Q.pl("  Trial  sinTab  cosTab  sinLib");
    for(int i = 0; i<sets; i++) {
        Q.pf("%7d\t%7.2f\t%7.2f\t%7.2f\n", i, testSinTab(reps), testCosTab(reps), testSinLib(reps));
    }
}

private static float[] sample(int n) {
    Random rand = new Random();
    float[] values = new float[n];
    for (int i=0; i<n; i++) {
        values[i] = 400*(rand.nextFloat()*2-1);
    }
    return values;
}
private static float testSinTab(int n) {
    float[] sample = sample(n);
    long time = -System.nanoTime();
    for (int i=0; i<n; i++) {
        sample[i] = sin(sample[i]);
    }
    time += System.nanoTime();
    return (time/1e6f);
}
private static float testCosTab(int n) {
    float[] sample = sample(n);
    long time = -System.nanoTime();
    for (int i=0; i<n; i++) {
        sample[i] = cos(sample[i]);
    }
    time += System.nanoTime();
    return time/1e6f;
}
private static float testSinLib(int n) {
    float[] sample = sample(n);
    long time = -System.nanoTime();
    for (int i=0; i<n; i++) {
        sample[i] = (float) Math.sin(sample[i]);
    }
    time += System.nanoTime();
    return time/1e6f;
}

output:

  Trial  sinTab  cosTab  sinLib
      0  102.51  111.19  596.57
      1   93.72   92.20  578.22
      2  100.06  107.20  600.68
      3  103.65  102.67  629.86

Solution 3

You can try http://sourceforge.net/projects/jafama/

It uses look-up tables, so it might actually be slower than Math, especially if the tables are often evicted from CPU cache, but for thousands of successive calls it can be quite faster.

It also seems slower during class load (maybe the JIT doesn't kicks in then yet), so you might want to avoid it in that particular use-case.

Share:
11,084
Admin
Author by

Admin

Updated on June 04, 2022

Comments

  • Admin
    Admin about 2 years

    I know about the Math.sin() and Math.cos() functions, but I'm wondering if there's a way I can create (or use an already-existing) a faster function, given that I don't care about pinpoint accuracy. I'm looking to execute a basic sin or cos calculation, and have it perform essentially as fast as possible. Would simply iterating the sigma a few times be any faster than Math.sin()?

  • maybeWeCouldStealAVan
    maybeWeCouldStealAVan about 11 years
    I like this plan, but it won't work for negative angles as is. % isn't modulo in the mathematical sense, but rather the remainder. Easily fixed, though.
  • arynaq
    arynaq about 11 years
    Aye just need to return -sin[angle] for negatives, cos is the same.
  • RGV
    RGV about 11 years
    the OP has asked for a faster solution and not a slow one.
  • supercat
    supercat about 10 years
    If one uses a tables of 17 sine/cosine pairs within an octant (272 bytes), a fourth-order polynomial approximation will suffice to achieve accuracy of about half a part per billion. Depending upon the size of the CPU cache and usage patterns, it may be helpful to expand that to 33 pairs for a quadrant [allowing the table to hold sine/cosine pairs in consecutive array slots without requiring them to be conditionally swapped], or even to use a switch statement to select one of 17 pairs of forumulas, or 33 formulas or pairs thereof (depending upon whether one wants just sin, just cos, or both).
  • supercat
    supercat about 10 years
    A compromise would be to use 33 groups of five double values which would be used in polynomial approximations with the residue after selecting an quadrant, or 129 groups of five for a whole circle (avoiding conditional logic). The larger table would take about 5K, which might be a problem for machines with small caches, but would replace four multiples with three consecutive-item fetches when computing sine alone, or eight multiplies with eight consecutive-item fetches when computing both.
  • NateS
    NateS almost 8 years
    LUT benchmarks will be fast since they can use the CPU cache. FWIW, LUTs can wreck havoc on the CPU cache for real world application usage.
  • moonheart08
    moonheart08 over 5 years
    No, it's not sadly. It's actually slow enough that it usually gets replaced when the game is modded (See: BetterFPS/Optifine, which both override it). That's Notch's implementation, and it's not all that good :P