Finding prime numbers with the Sieve of Eratosthenes (Originally: Is there a better way to prepare this array?)

java arrays primes sieve-of-eratosthenes

42,275

Solution 1

Your method of finding primes, by comparing every single element of the array with every possible factor is hideously inefficient. You can improve it immensely by doing a Sieve of Eratosthenes over the entire array at once. Besides doing far fewer comparisons, it also uses addition rather than division. Division is way slower.

Solution 2

`ArrayList<>` Sieve of Eratosthenes

// Return primes less than limit
static ArrayList<Integer> generatePrimes(int limit) {
    final int numPrimes = countPrimesUpperBound(limit);
    ArrayList<Integer> primes = new ArrayList<Integer>(numPrimes);
    boolean [] isComposite    = new boolean [limit];   // all false
    final int sqrtLimit       = (int)Math.sqrt(limit); // floor
    for (int i = 2; i <= sqrtLimit; i++) {
        if (!isComposite [i]) {
            primes.add(i);
            for (int j = i*i; j < limit; j += i) // `j+=i` can overflow
                isComposite [j] = true;
        }
    }
    for (int i = sqrtLimit + 1; i < limit; i++)
        if (!isComposite [i])
            primes.add(i);
    return primes;
}

Formula for upper bound of number of primes less than or equal to max (see wolfram.com):

static int countPrimesUpperBound(int max) {
    return max > 1 ? (int)(1.25506 * max / Math.log((double)max)) : 0;
}

Solution 3

Create an ArrayList<Integer> and then convert to an int[] at the end.

There are various 3rd party IntList (etc) classes around, but unless you're really worried about the hit of boxing a few integers, I wouldn't worry about it.

You could use Arrays.copyOf to create the new array though. You might also want to resize by doubling in size each time you need to, and then trim at the end. That would basically be mimicking the ArrayList behaviour.

Solution 4

Algo using Sieve of Eratosthenes

public static List<Integer> findPrimes(int limit) {

    List<Integer> list = new ArrayList<>();

    boolean [] isComposite = new boolean [limit + 1]; // limit + 1 because we won't use '0'th index of the array
    isComposite[1] = true;

    // Mark all composite numbers
    for (int i = 2; i <= limit; i++) {
        if (!isComposite[i]) {
            // 'i' is a prime number
            list.add(i);
            int multiple = 2;
            while (i * multiple <= limit) {
                isComposite [i * multiple] = true;
                multiple++;
            }
        }
    }

    return list;
}

Image depicting the above algo (Grey color cells represent prime number. Since we consider all numbers as prime numbers intially, the whole is grid is grey initially.)

Image Source: WikiMedia

Solution 5

Are you using Java 1.5? Why not return List<Integer> and use ArrayList<Integer>? If you do need to return an int[], you can do it by converting List to int[] at the end of processing.

View more solutions

42,275

eleven81

Updated on July 09, 2022

Comments

eleven81 almost 2 years
Note: Version 2, below, uses the Sieve of Eratosthenes. There are several answers that helped with what I originally asked. I have chosen the Sieve of Eratosthenes method, implemented it, and changed the question title and tags appropriately. Thanks to everyone who helped!

Introduction

I wrote this fancy little method that generates an array of int containing the prime numbers less than the specified upper bound. It works very well, but I have a concern.

The Method
```
private static int [] generatePrimes(int max) {
    int [] temp = new int [max];
    temp [0] = 2;
    int index = 1;
    int prime = 1;
    boolean isPrime = false;
    while((prime += 2) <= max) {
        isPrime = true;
        for(int i = 0; i < index; i++) {
            if(prime % temp [i] == 0) {
                isPrime = false;
                break;
            }
        }
        if(isPrime) {
            temp [index++] = prime;
        }
    }
    int [] primes = new int [index];
    while(--index >= 0) {
        primes [index] = temp [index];
    }
    return primes;
}
```
My Concern

My concern is that I am creating an array that is far too large for the final number of elements the method will return. The trouble is that I don't know of a good way to correctly guess the number of prime numbers less than a specified number.

Focus

This is how the program uses the arrays. This is what I want to improve upon.
1. I create a temporary array that is large enough to hold every number less than the limit.
2. I generate the prime numbers, while keeping count of how many I have generated.
3. I make a new array that is the right dimension to hold just the prime numbers.
4. I copy each prime number from the huge array to the array of the correct dimension.
5. I return the array of the correct dimension that holds just the prime numbers I generated.
Questions
1. Can I copy the whole chunk (at once) of temp[] that has nonzero elements to primes[] without having to iterate through both arrays and copy the elements one by one?
2. Are there any data structures that behave like an array of primitives that can grow as elements are added, rather than requiring a dimension upon instantiation? What is the performance penalty compared to using an array of primitives?
Version 2 (thanks to Jon Skeet):
```
private static int [] generatePrimes(int max) {
    int [] temp = new int [max];
    temp [0] = 2;
    int index = 1;
    int prime = 1;
    boolean isPrime = false;
    while((prime += 2) <= max) {
        isPrime = true;
        for(int i = 0; i < index; i++) {
            if(prime % temp [i] == 0) {
                isPrime = false;
                break;
            }
        }
        if(isPrime) {
            temp [index++] = prime;
        }
    }
    return Arrays.copyOfRange(temp, 0, index);
}
```
Version 3 (thanks to Paul Tomblin) which uses the Sieve of Erastosthenes:
```
private static int [] generatePrimes(int max) {
    boolean[] isComposite = new boolean[max + 1];
    for (int i = 2; i * i <= max; i++) {
        if (!isComposite [i]) {
            for (int j = i; i * j <= max; j++) {
                isComposite [i*j] = true;
            }
        }
    }
    int numPrimes = 0;
    for (int i = 2; i <= max; i++) {
        if (!isComposite [i]) numPrimes++;
    }
    int [] primes = new int [numPrimes];
    int index = 0;
    for (int i = 2; i <= max; i++) {
        if (!isComposite [i]) primes [index++] = i;
    }
    return primes;
}
```
- Matt Davison about 15 years
  
  why do you need a fixed size array?
- eleven81 about 15 years
  
  Matt Davison: I don't want to return an array that has a bunch of zero elements at the end, it feels so sloppy.
- Paul Tomblin about 15 years
  
  One micro-optimization I'd make: replace "for (int j = i; i*j..." with "for (int j = i; j <= max; j+=i) { isComposite[j] = true;}"
- eleven81 about 15 years
  
  Paul Tomblin: The replacement code you provided misses several prime numbers.
- Vishy about 14 years
  
  Using a BitSet is 8x more efficient than a large boolean[].
- om-nom-nom about 11 years
  
  @PeterLawrey it is efficient from memory point of view, but less efficient from CPU perspective.
- Vishy about 11 years
  
  @om-nom-nom If using BitSet fits in cache and byte[] doesn't it will be much faster too. Given you are likely to use large collections, making the best use of your cache is likely to be far more important than saving a little cpu.
Paul Tomblin about 15 years

Yes, that was just a placeholder until I could look it up on Wikipedia.
eleven81 about 15 years

Paul Tomblin: My method just returns an array of int representing the prime numbers less than the specified upper bound, I don't think I need to sieve anything.
Hank Gay about 15 years

The problem is the use of arrays - you can't actually remove an element from a Java array, just 0/null it out. The Sieve is orthogonal.
Paul Tomblin about 15 years

But you're checking every value of the array against every possible factor. By doing a Sieve, you check the whole array for primes in one shot, reducing the complexity from O(n^2) to O((nlogn)(loglogn))
eleven81 about 15 years

I like java.util.Arrays.copyOfRange(int [] original, int from, int to);
Paul Tomblin about 15 years

@Hank, no, the problem is that he's trying to optimize the half a millisecond it will take to convert his array, instead of the several seconds it will take to generate primes his way.
eleven81 about 15 years

Paul, I have implemented the Sieve of Eratosthenes like you suggested. For 10,000,000 elements the sieve method runs in 700 ms, compared to 4,800 ms for my method. Thanks!
Giovanni Botta over 10 years

My two cents in this blog post.
Dennis Vash about 5 years

Welcome to StackOverflow! You should add an addition explanation to your answer, check out How do I write a good answer?
nambastha over 3 years

prime counting function was one very interesting thing in your answer