Finding prime numbers with the Sieve of Eratosthenes (Originally: Is there a better way to prepare this array?)

42,275

Solution 1

Your method of finding primes, by comparing every single element of the array with every possible factor is hideously inefficient. You can improve it immensely by doing a Sieve of Eratosthenes over the entire array at once. Besides doing far fewer comparisons, it also uses addition rather than division. Division is way slower.

Solution 2

ArrayList<> Sieve of Eratosthenes

// Return primes less than limit
static ArrayList<Integer> generatePrimes(int limit) {
    final int numPrimes = countPrimesUpperBound(limit);
    ArrayList<Integer> primes = new ArrayList<Integer>(numPrimes);
    boolean [] isComposite    = new boolean [limit];   // all false
    final int sqrtLimit       = (int)Math.sqrt(limit); // floor
    for (int i = 2; i <= sqrtLimit; i++) {
        if (!isComposite [i]) {
            primes.add(i);
            for (int j = i*i; j < limit; j += i) // `j+=i` can overflow
                isComposite [j] = true;
        }
    }
    for (int i = sqrtLimit + 1; i < limit; i++)
        if (!isComposite [i])
            primes.add(i);
    return primes;
}

Formula for upper bound of number of primes less than or equal to max (see wolfram.com):

static int countPrimesUpperBound(int max) {
    return max > 1 ? (int)(1.25506 * max / Math.log((double)max)) : 0;
}

Solution 3

Create an ArrayList<Integer> and then convert to an int[] at the end.

There are various 3rd party IntList (etc) classes around, but unless you're really worried about the hit of boxing a few integers, I wouldn't worry about it.

You could use Arrays.copyOf to create the new array though. You might also want to resize by doubling in size each time you need to, and then trim at the end. That would basically be mimicking the ArrayList behaviour.

Solution 4

Algo using Sieve of Eratosthenes

public static List<Integer> findPrimes(int limit) {

    List<Integer> list = new ArrayList<>();

    boolean [] isComposite = new boolean [limit + 1]; // limit + 1 because we won't use '0'th index of the array
    isComposite[1] = true;

    // Mark all composite numbers
    for (int i = 2; i <= limit; i++) {
        if (!isComposite[i]) {
            // 'i' is a prime number
            list.add(i);
            int multiple = 2;
            while (i * multiple <= limit) {
                isComposite [i * multiple] = true;
                multiple++;
            }
        }
    }

    return list;
}

Image depicting the above algo (Grey color cells represent prime number. Since we consider all numbers as prime numbers intially, the whole is grid is grey initially.)

enter image description here

Image Source: WikiMedia

Solution 5

Are you using Java 1.5? Why not return List<Integer> and use ArrayList<Integer>? If you do need to return an int[], you can do it by converting List to int[] at the end of processing.

Share:
42,275

Related videos on Youtube

eleven81
Author by

eleven81

Updated on July 09, 2022

Comments

  • eleven81
    eleven81 almost 2 years

    Note: Version 2, below, uses the Sieve of Eratosthenes. There are several answers that helped with what I originally asked. I have chosen the Sieve of Eratosthenes method, implemented it, and changed the question title and tags appropriately. Thanks to everyone who helped!

    Introduction

    I wrote this fancy little method that generates an array of int containing the prime numbers less than the specified upper bound. It works very well, but I have a concern.

    The Method

    private static int [] generatePrimes(int max) {
        int [] temp = new int [max];
        temp [0] = 2;
        int index = 1;
        int prime = 1;
        boolean isPrime = false;
        while((prime += 2) <= max) {
            isPrime = true;
            for(int i = 0; i < index; i++) {
                if(prime % temp [i] == 0) {
                    isPrime = false;
                    break;
                }
            }
            if(isPrime) {
                temp [index++] = prime;
            }
        }
        int [] primes = new int [index];
        while(--index >= 0) {
            primes [index] = temp [index];
        }
        return primes;
    }
    

    My Concern

    My concern is that I am creating an array that is far too large for the final number of elements the method will return. The trouble is that I don't know of a good way to correctly guess the number of prime numbers less than a specified number.

    Focus

    This is how the program uses the arrays. This is what I want to improve upon.

    1. I create a temporary array that is large enough to hold every number less than the limit.
    2. I generate the prime numbers, while keeping count of how many I have generated.
    3. I make a new array that is the right dimension to hold just the prime numbers.
    4. I copy each prime number from the huge array to the array of the correct dimension.
    5. I return the array of the correct dimension that holds just the prime numbers I generated.

    Questions

    1. Can I copy the whole chunk (at once) of temp[] that has nonzero elements to primes[] without having to iterate through both arrays and copy the elements one by one?
    2. Are there any data structures that behave like an array of primitives that can grow as elements are added, rather than requiring a dimension upon instantiation? What is the performance penalty compared to using an array of primitives?

    Version 2 (thanks to Jon Skeet):

    private static int [] generatePrimes(int max) {
        int [] temp = new int [max];
        temp [0] = 2;
        int index = 1;
        int prime = 1;
        boolean isPrime = false;
        while((prime += 2) <= max) {
            isPrime = true;
            for(int i = 0; i < index; i++) {
                if(prime % temp [i] == 0) {
                    isPrime = false;
                    break;
                }
            }
            if(isPrime) {
                temp [index++] = prime;
            }
        }
        return Arrays.copyOfRange(temp, 0, index);
    }
    

    Version 3 (thanks to Paul Tomblin) which uses the Sieve of Erastosthenes:

    private static int [] generatePrimes(int max) {
        boolean[] isComposite = new boolean[max + 1];
        for (int i = 2; i * i <= max; i++) {
            if (!isComposite [i]) {
                for (int j = i; i * j <= max; j++) {
                    isComposite [i*j] = true;
                }
            }
        }
        int numPrimes = 0;
        for (int i = 2; i <= max; i++) {
            if (!isComposite [i]) numPrimes++;
        }
        int [] primes = new int [numPrimes];
        int index = 0;
        for (int i = 2; i <= max; i++) {
            if (!isComposite [i]) primes [index++] = i;
        }
        return primes;
    }
    
    • Matt Davison
      Matt Davison about 15 years
      why do you need a fixed size array?
    • eleven81
      eleven81 about 15 years
      Matt Davison: I don't want to return an array that has a bunch of zero elements at the end, it feels so sloppy.
    • Paul Tomblin
      Paul Tomblin about 15 years
      One micro-optimization I'd make: replace "for (int j = i; i*j..." with "for (int j = i; j <= max; j+=i) { isComposite[j] = true;}"
    • eleven81
      eleven81 about 15 years
      Paul Tomblin: The replacement code you provided misses several prime numbers.
    • Vishy
      Vishy about 14 years
      Using a BitSet is 8x more efficient than a large boolean[].
    • om-nom-nom
      om-nom-nom about 11 years
    • Vishy
      Vishy about 11 years
      @om-nom-nom If using BitSet fits in cache and byte[] doesn't it will be much faster too. Given you are likely to use large collections, making the best use of your cache is likely to be far more important than saving a little cpu.
  • Paul Tomblin
    Paul Tomblin about 15 years
    Yes, that was just a placeholder until I could look it up on Wikipedia.
  • eleven81
    eleven81 about 15 years
    Paul Tomblin: My method just returns an array of int representing the prime numbers less than the specified upper bound, I don't think I need to sieve anything.
  • Hank Gay
    Hank Gay about 15 years
    The problem is the use of arrays - you can't actually remove an element from a Java array, just 0/null it out. The Sieve is orthogonal.
  • Paul Tomblin
    Paul Tomblin about 15 years
    But you're checking every value of the array against every possible factor. By doing a Sieve, you check the whole array for primes in one shot, reducing the complexity from O(n^2) to O((nlogn)(loglogn))
  • eleven81
    eleven81 about 15 years
    I like java.util.Arrays.copyOfRange(int [] original, int from, int to);
  • Paul Tomblin
    Paul Tomblin about 15 years
    @Hank, no, the problem is that he's trying to optimize the half a millisecond it will take to convert his array, instead of the several seconds it will take to generate primes his way.
  • eleven81
    eleven81 about 15 years
    Paul, I have implemented the Sieve of Eratosthenes like you suggested. For 10,000,000 elements the sieve method runs in 700 ms, compared to 4,800 ms for my method. Thanks!
  • Giovanni Botta
    Giovanni Botta over 10 years
    My two cents in this blog post.
  • Dennis Vash
    Dennis Vash about 5 years
    Welcome to StackOverflow! You should add an addition explanation to your answer, check out How do I write a good answer?
  • nambastha
    nambastha over 3 years
    prime counting function was one very interesting thing in your answer