Calculating Standard Deviation, Variance in Java

Here's a class that does it for long values without requiring an array or list. Modify as you wish.

package statistics;

import java.text.DecimalFormat;
import java.text.NumberFormat;

/**
 * Calculate running statistics without having to maintain arrays or lists in memory.
 * @see <a href="http://stackoverflow.com/questions/43675485/calculating-standard-deviation-variance-in-java">Calculating Standard Deviation, Variance in Java</a>
 */
public class StatisticsUtils {
    private static final String DEFAULT_FORMAT = "0.###";
    private static final NumberFormat FORMATTER = new DecimalFormat(DEFAULT_FORMAT);

    private long sum;
    private long squares;
    private long count;
    private long max;
    private long min;
    private long last;
    private long failureCount;
    private long resetCount;
    private String lastFailureReason;

    public StatisticsUtils() {
        reset();
    }

    public synchronized void addFailure(String reason) {
        this.lastFailureReason = reason;
        this.failureCount++;
    }

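    public synchronized void addValue(long x) {
        try {
            // Compute both totals with overflow-checked arithmetic before
            // committing either, so a failure cannot leave them inconsistent.
            long newSum = Math.addExact(sum, x);
            long newSquares = Math.addExact(squares, Math.multiplyExact(x, x));
            sum = newSum;
            squares = newSquares;
        } catch (ArithmeticException overflow) {
            // A running total no longer fits in a long; reset the state back
            // to zero and start again. All previous calculations are lost.
            // (Better as all doubles?)
            reset();
            return;
        }
        min = Math.min(min, x);
        max = Math.max(max, x);
        last = x;
        ++count;
    }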

    public synchronized void reset() {
        sum = 0L;
        squares = 0L;
        count = 0L;
        max = Long.MIN_VALUE;
        min = Long.MAX_VALUE;
        last = 0L;
        this.resetCount++;
    }

    public synchronized double getMean() {
        double mean = 0.0;
        if (count > 0L) {
            mean = (double) sum/count;
        }
        return mean;
    }

    public synchronized double getVariance() {
        double variance = 0.0;
        if (count > 1L) {
            variance = (squares-(double)sum*sum/count)/(count-1);
        }
        return variance;
    }

    public synchronized double getStdDev() {
        return Math.sqrt(this.getVariance());
    }

    public synchronized long getCount() {
        return count;
    }

    public synchronized long getSum() {
        return sum;
    }

    public synchronized long getMax() {
        return max;
    }

    public synchronized long getMin() {
        return min;
    }

    public synchronized long getLast() {
        return last;
    }

    public synchronized String getLastFailureReason() {
        return lastFailureReason;
    }

    public synchronized long getFailureCount() {
        return failureCount;
    }

    public synchronized long getResetCount() {
        return resetCount;
    }

    @Override
    public synchronized String toString() {
        return "StatisticsUtils{" +
                "sum=" + sum +
                ", min=" + min +
                ", max=" + max +
                ", last=" + last +
                ", squares=" + squares +
                ", count=" + count +
                ", mean=" + FORMATTER.format(getMean()) +
                ", dev=" + FORMATTER.format(getStdDev()) +
                '}';
    }
}
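The comment in addValue asks whether doubles would be better. One way to sidestep the sum-of-squares overflow is Welford's online algorithm, which tracks the running mean and the sum of squared deviations instead of raw sums. The sketch below is an alternative, not part of the class above; the name WelfordStats is made up for illustration, and it returns the sample variance (dividing by count - 1) to match getVariance().

```java
/** Hypothetical alternative to StatisticsUtils using Welford's online algorithm. */
final class WelfordStats {
    private long count;
    private double mean; // running mean of all values seen so far
    private double m2;   // running sum of squared deviations from the mean

    public synchronized void addValue(double x) {
        count++;
        double delta = x - mean;   // deviation from the old mean
        mean += delta / count;     // update the mean
        m2 += delta * (x - mean);  // old deviation times new deviation
    }

    public synchronized long getCount() {
        return count;
    }

    public synchronized double getMean() {
        return count > 0 ? mean : 0.0;
    }

    /** Sample variance (n - 1 in the denominator); 0.0 for fewer than two values. */
    public synchronized double getVariance() {
        return count > 1 ? m2 / (count - 1) : 0.0;
    }

    public synchronized double getStdDev() {
        return Math.sqrt(getVariance());
    }

    public static void main(String[] args) {
        WelfordStats stats = new WelfordStats();
        for (double x : new double[] {2, 4, 4, 4, 5, 5, 7, 9}) {
            stats.addValue(x);
        }
        System.out.println("mean=" + stats.getMean()
                + ", variance=" + stats.getVariance()
                + ", stdDev=" + stats.getStdDev());
    }
}
```

Because no squared value is ever accumulated directly, inputs around 4,000,000,000 (whose squares do not fit in a long) are handled without a reset, at the cost of ordinary double rounding.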
Author by

drai29

Updated on June 04, 2022

Comments

  • drai29
    drai29 almost 2 years

    I'm building test cases for several different aggregate functions.

    I'm trying to write the formulas for standard deviation and variance by hand.

    So far I have the equations below, but my results are a little bit off:

    //Calculate Standard Deviation
    double Sum1 = 0;
    double Sum2 = 0;
    long count = 0;
    
    
    
    public Object compute()
    {
    if (count > 0)
          return Math.sqrt(count*Sum2 - Math.pow(Sum1, 2)) / count;
    else
        return null;
    }
    
    //Calculate Variance
    double sum = 0;
    double count = 0;
    
    public Object compute()
    {
    if(count>0)
        return (Math.pow(sum-(sum/count),1))/count;
    else
        return null;
    }

    Where I'm getting stuck the most is the variance. I know I can calculate the mean with:

    Math.pow(sum, 1)/count;
    

    I'm having trouble with the next part, which is calculating the deviation of all these numbers from the mean.

    I already retrieve all the numbers in a way that makes these calculations possible; I just don't have the equations right. Ideally I could do it in a single equation instead of computing each part separately. If anyone could help, it would be greatly appreciated. Thank you.
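For reference, both quantities can be computed in a single expression from the running totals the question already keeps. The population variance is Sum2/count - (Sum1/count)^2, and the standard-deviation expression in the question, sqrt(count*Sum2 - Sum1^2)/count, is algebraically the square root of that. The sketch below assumes Sum1 is the sum of the values and Sum2 is the sum of their squares; the class name RunningMoments and its method names are made up for illustration.

```java
/** Hypothetical helper; sum1, sum2 and count mirror the question's Sum1, Sum2 and count. */
final class RunningMoments {
    private double sum1; // running sum of x
    private double sum2; // running sum of x * x
    private long count;  // number of values added

    void add(double x) {
        sum1 += x;
        sum2 += x * x;
        count++;
    }

    /** Population variance (divide by n), or null when no values were added. */
    Double populationVariance() {
        if (count == 0) {
            return null;
        }
        double mean = sum1 / count;
        // Clamp at zero: rounding can push the difference slightly negative.
        return Math.max(0.0, sum2 / count - mean * mean);
    }

    /** Sample variance (divide by n - 1), or null for fewer than two values. */
    Double sampleVariance() {
        if (count < 2) {
            return null;
        }
        return Math.max(0.0, (sum2 - sum1 * sum1 / count) / (count - 1));
    }

    /** Population standard deviation, or null when no values were added. */
    Double populationStdDev() {
        Double variance = populationVariance();
        return variance == null ? null : Math.sqrt(variance);
    }

    public static void main(String[] args) {
        RunningMoments m = new RunningMoments();
        for (double x : new double[] {2, 4, 4, 4, 5, 5, 7, 9}) {
            m.add(x);
        }
        System.out.println(m.populationVariance()); // prints 4.0
        System.out.println(m.populationStdDev());   // prints 2.0
    }
}
```

Whether to divide by count or count - 1 depends on whether the test cases expect the population or the sample statistic, which is a common reason for results being slightly off. For values with a large mean and small spread, this sum-of-squares form can lose precision in floating point.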