Most elegant way to generate prime numbers

78,348

Solution 1

Many thanks to all who gave helpful answers. Here are my implementations of a few different methods of finding the first n primes in C#. The first two methods are pretty much what was posted here. (The posters names are next to the title.) I plan on doing the sieve of Atkin sometime, although I suspect it won't be quite as simple as the methods here currently. If anybody can see any way of improving any of these methods I'd love to know :-)

Standard Method (Peter Smit, jmservera, Rekreativc)

The first prime number is 2. Add this to a list of primes. The next prime is the next number that is not evenly divisible by any number on this list.

public static List<int> GeneratePrimesNaive(int n)
{
    List<int> primes = new List<int>();
    primes.Add(2);
    int nextPrime = 3;
    while (primes.Count < n)
    {
        int sqrt = (int)Math.Sqrt(nextPrime);
        bool isPrime = true;
        for (int i = 0; (int)primes[i] <= sqrt; i++)
        {
            if (nextPrime % primes[i] == 0)
            {
                isPrime = false;
                break;
            }
        }
        if (isPrime)
        {
            primes.Add(nextPrime);
        }
        nextPrime += 2;
    }
    return primes;
}

This has been optimised by only testing for divisibility up to the square root of the number being tested; and by only testing odd numbers. This can be further optimised by testing only numbers of the form 6k+[1, 5], or 30k+[1, 7, 11, 13, 17, 19, 23, 29] or so on.

Sieve of Eratosthenes (starblue)

This finds all the primes to k. To make a list of the first n primes, we first need to approximate value of the nth prime. The following method, as described here, does this.

public static int ApproximateNthPrime(int nn)
{
    double n = (double)nn;
    double p;
    if (nn >= 7022)
    {
        p = n * Math.Log(n) + n * (Math.Log(Math.Log(n)) - 0.9385);
    }
    else if (nn >= 6)
    {
        p = n * Math.Log(n) + n * Math.Log(Math.Log(n));
    }
    else if (nn > 0)
    {
        p = new int[] { 2, 3, 5, 7, 11 }[nn - 1];
    }
    else
    {
        p = 0;
    }
    return (int)p;
}

// Find all primes up to and including the limit
public static BitArray SieveOfEratosthenes(int limit)
{
    BitArray bits = new BitArray(limit + 1, true);
    bits[0] = false;
    bits[1] = false;
    for (int i = 0; i * i <= limit; i++)
    {
        if (bits[i])
        {
            for (int j = i * i; j <= limit; j += i)
            {
                bits[j] = false;
            }
        }
    }
    return bits;
}

public static List<int> GeneratePrimesSieveOfEratosthenes(int n)
{
    int limit = ApproximateNthPrime(n);
    BitArray bits = SieveOfEratosthenes(limit);
    List<int> primes = new List<int>();
    for (int i = 0, found = 0; i < limit && found < n; i++)
    {
        if (bits[i])
        {
            primes.Add(i);
            found++;
        }
    }
    return primes;
}

Sieve of Sundaram

I only discovered this sieve recently, but it can be implemented quite simply. My implementation isn't as fast as the sieve of Eratosthenes, but it is significantly faster than the naive method.

public static BitArray SieveOfSundaram(int limit)
{
    limit /= 2;
    BitArray bits = new BitArray(limit + 1, true);
    for (int i = 1; 3 * i + 1 < limit; i++)
    {
        for (int j = 1; i + j + 2 * i * j <= limit; j++)
        {
            bits[i + j + 2 * i * j] = false;
        }
    }
    return bits;
}

public static List<int> GeneratePrimesSieveOfSundaram(int n)
{
    int limit = ApproximateNthPrime(n);
    BitArray bits = SieveOfSundaram(limit);
    List<int> primes = new List<int>();
    primes.Add(2);
    for (int i = 1, found = 1; 2 * i + 1 <= limit && found < n; i++)
    {
        if (bits[i])
        {
            primes.Add(2 * i + 1);
            found++;
        }
    }
    return primes;
}

Solution 2

Use the estimate

pi(n) = n / log(n)

for the number of primes up to n to find a limit, and then use a sieve. The estimate underestimates the number of primes up to n somewhat, so the sieve will be slightly larger than necessary, which is ok.

This is my standard Java sieve, computes the first million primes in about a second on a normal laptop:

public static BitSet computePrimes(int limit)
{
    final BitSet primes = new BitSet();
    primes.set(0, false);
    primes.set(1, false);
    primes.set(2, limit, true);
    for (int i = 0; i * i < limit; i++)
    {
        if (primes.get(i))
        {
            for (int j = i * i; j < limit; j += i)
            {
                primes.clear(j);
            }
        }
    }
    return primes;
}

Solution 3

Ressurecting an old question, but I stumbled over it while playing with LINQ.

This Code Requires .NET4.0 or .NET3.5 With Parallel Extensions

public List<int> GeneratePrimes(int n) {
    var r = from i in Enumerable.Range(2, n - 1).AsParallel()
            where Enumerable.Range(1, (int)Math.Sqrt(i)).All(j => j == 1 || i % j != 0)
            select i;
    return r.ToList();
}

Solution 4

You are on the good path.

Some comments

  • primes.Add(3); makes that this function doesn't work for number = 1

  • You dont't have to test the division with primenumbers bigger that the squareroot of the number to be tested.

Suggested code:

ArrayList generatePrimes(int toGenerate)
{
    ArrayList primes = new ArrayList();

    if(toGenerate > 0) primes.Add(2);

    int curTest = 3;
    while (primes.Count < toGenerate)
    {

        int sqrt = (int) Math.sqrt(curTest);

        bool isPrime = true;
        for (int i = 0; i < primes.Count && primes.get(i) <= sqrt; ++i)
        {
            if (curTest % primes.get(i) == 0)
            {
                isPrime = false;
                break;
            }
        }

        if(isPrime) primes.Add(curTest);

        curTest +=2
    }
    return primes;
}

Solution 5

you should take a look at probable primes. In particular take a look to Randomized Algorithms and Miller–Rabin primality test.

For the sake of completeness you could just use java.math.BigInteger:

public class PrimeGenerator implements Iterator<BigInteger>, Iterable<BigInteger> {

    private BigInteger p = BigInteger.ONE;

    @Override
    public boolean hasNext() {
        return true;
    }

    @Override
    public BigInteger next() {
        p = p.nextProbablePrime();
        return p;
    }

    @Override
    public void remove() {
        throw new UnsupportedOperationException("Not supported.");
    }

    @Override
    public Iterator<BigInteger> iterator() {
        return this;
    }
}

@Test
public void printPrimes() {
    for (BigInteger p : new PrimeGenerator()) {
        System.out.println(p);
    }
}
Share:
78,348
David Johnstone
Author by

David Johnstone

Updated on July 08, 2022

Comments

  • David Johnstone
    David Johnstone almost 2 years

    What is the most elegant way to implement this function:

    ArrayList generatePrimes(int n)
    

    This function generates the first n primes (edit: where n>1), so generatePrimes(5) will return an ArrayList with {2, 3, 5, 7, 11}. (I'm doing this in C#, but I'm happy with a Java implementation - or any other similar language for that matter (so not Haskell)).

    I do know how to write this function, but when I did it last night it didn't end up as nice as I was hoping. Here is what I came up with:

    ArrayList generatePrimes(int toGenerate)
    {
        ArrayList primes = new ArrayList();
        primes.Add(2);
        primes.Add(3);
        while (primes.Count < toGenerate)
        {
            int nextPrime = (int)(primes[primes.Count - 1]) + 2;
            while (true)
            {
                bool isPrime = true;
                foreach (int n in primes)
                {
                    if (nextPrime % n == 0)
                    {
                        isPrime = false;
                        break;
                    }
                }
                if (isPrime)
                {
                    break;
                }
                else
                {
                    nextPrime += 2;
                }
            }
            primes.Add(nextPrime);
        }
        return primes;
    }
    

    I'm not too concerned about speed, although I don't want it to be obviously inefficient. I don't mind which method is used (naive or sieve or anything else), but I do want it to be fairly short and obvious how it works.

    Edit: Thanks to all who have responded, although many didn't answer my actual question. To reiterate, I wanted a nice clean piece of code that generated a list of prime numbers. I already know how to do it a bunch of different ways, but I'm prone to writing code that isn't as clear as it could be. In this thread a few good options have been proposed:

    • A nicer version of what I originally had (Peter Smit, jmservera and Rekreativc)
    • A very clean implementation of the sieve of Eratosthenes (starblue)
    • Use Java's BigIntegers and nextProbablePrime for very simple code, although I can't imagine it being particularly efficient (dfa)
    • Use LINQ to lazily generate the list of primes (Maghis)
    • Put lots of primes in a text file and read them in when necessary (darin)

    Edit 2: I've implemented in C# a couple of the methods given here, and another method not mentioned here. They all find the first n primes effectively (and I have a decent method of finding the limit to provide to the sieves).

  • David Božjak
    David Božjak almost 15 years
    I'm sure its very elegant to just read primes from a file, however I don't think this is really relevant to the question. He is probably more interested in how you generate the numbers in prime number generator...
  • stevenvh
    stevenvh almost 15 years
    Even if he says "I don't mind which method..." I don't think that includes "open a list of primes". That would be like answering the question "how to build a computer" by "buy a computer". -1
  • matbrgz
    matbrgz almost 15 years
    It is only the fastest for big primes - for small numbers the tight loop is much much faster.
  • Prasanth Kumar
    Prasanth Kumar almost 15 years
    It would be faster if you actually wrote the prime numbers in the source code itself, instead of reading them from a file.
  • Darin Dimitrov
    Darin Dimitrov almost 15 years
    Writing them in the source file would consume too much memory.
  • David Božjak
    David Božjak almost 15 years
    This is true, however I wasn't designing a algorithm, I was just showing him how his method can be made more elegant. So this version is equally wrong as the one in opening question.
  • David Johnstone
    David Johnstone almost 15 years
    Yeah, I'm a big fan of Haskell too (I just wish I knew it better)
  • jmservera
    jmservera almost 15 years
    Consume much memory? more than reading the full primes list as text into... memory? Do you know how strings work in .net?
  • David Johnstone
    David Johnstone almost 15 years
    It didn't occur to me that it was broken for n=1. I changed the question slightly so that the function only has to work for n>1 :-)
  • SND
    SND almost 15 years
    The list of primes is an infinite but immutable list so it makes prefect sense to use a precalculated list upto the likely upper bound for the application. Why waste time writing code that may be buggy when there is a correct public list available that can be used to meet the requirements.
  • David Johnstone
    David Johnstone almost 15 years
    That's a very nice implementation of the sieve of Eratosthenes
  • Prasanth Kumar
    Prasanth Kumar almost 15 years
    @AndyM only the OP will be able to tell if using a publicly available list is valid for his purposes. I do know that I would use it in some scenarios, in the same way that I use a publicly available value of pi.
  • Darin Dimitrov
    Darin Dimitrov almost 15 years
    I've modified my post so that the file is read line by line to avoid loading the whole list into memory.
  • David Johnstone
    David Johnstone almost 15 years
    Well... This answer wasn't what I had in mind when I asked the question :-) I don't mind it, and it's certainly thinking outside the box, but I have a feeling it will never be faster than a decent sieve implementation because by the time primes are sparse enough to be hard to find (see prime number theorem), the numbers will be so long you'll be spending just as much time reading them off the hard drive.
  • Prasanth Kumar
    Prasanth Kumar almost 15 years
    @David: I don't see how "sparsity" comes into play when reading to the file. The file only contains the prime numbers, so if you need 70 you just read the first 70 lines and if you need 70,000,000 you just read 70,000,000 lines. It doesn't matter if the numbers stored there are larger or smaller.
  • David Johnstone
    David Johnstone almost 15 years
    You're right, the sparsity of primes has nothing to do when reading from a text file. What I meant is that the rate at which a method that actually generates primes will be slowed down as primes get larger due to the sparseness of primes (and also the size of the factors).
  • Prasanth Kumar
    Prasanth Kumar almost 15 years
    @darin: That memory issue could be discussed. Written in code is not synonymous to loaded in memory. (OTOH, we could just ask stackoverflow.com/questions/1032427/… how he finally store the primes)
  • Prasanth Kumar
    Prasanth Kumar almost 15 years
    @David: One problem with this question is that if numbers really get that long, it costs lots of time and/or memory to do anything with them. Hence a more realistic problem would be needed.
  • David Johnstone
    David Johnstone almost 15 years
    I wish I knew LINQ enough to appreciate and understand this answer better :-) Also, I have a feeling that this isn't an implementation of the sieve of Eratosthenes, and that it works conceptually the same as my original function (find the next number that isn't divisible by any of the previously found primes).
  • Maghis
    Maghis almost 15 years
    Yes, but "find the next number that isn't divisible by any of the previously found primes (smaller then number)" is conceptually similar to the sieve of eratosthenes. If you prefer, I can refactor it a bit to make it more readable even if you are not familiar with LINQ. Are you familiar with iterators?
  • Maghis
    Maghis almost 15 years
    The thing I like of this approach is that the next prime is calculated just when the caller asks for it, so things like "take the first n primes" or "take the primes that are smaller then n" become trivial
  • David Johnstone
    David Johnstone almost 15 years
    Actually, yes, given what starblue and myself said in that efficient storage of primes question, primes can be packed quite efficiently into memory. You can save a file that indicates all the primes up to n in n * 8 / 30 bits by using the file as a bit array where each bit indicates the primality of a number that can be expressed as 30k+/-1,7,11,13...
  • David Johnstone
    David Johnstone almost 15 years
    Thanks, but I can understand that enough to more or less know what it's doing :-) I like the lazy evaluation, but I still wouldn't call this an implementation of the sieve of Eratosthenes.
  • Darin Dimitrov
    Darin Dimitrov almost 15 years
    @Daniel, when I said that written in code would consume more memory than if they were read from a file was that you will usually store them into some variable that you will access at runtime. Maybe I am missing your point but could you please give me an example of how you intend storing these list of prime numbers into memory?
  • Darin Dimitrov
    Darin Dimitrov almost 15 years
    I really don't see why my answer got downvoted. While maybe it's not the best answer it seemed to me like a valid solution to the problem.
  • Prasanth Kumar
    Prasanth Kumar almost 15 years
    @darin: I haven't elaborated much, but If I had to do that for production code, I'd try playing with constants saved in a resource file... oops... Maybe I'm missing my own point too :)
  • Prasanth Kumar
    Prasanth Kumar almost 15 years
    @darin: I myself upvoted it. I made a statement about speed and putting the numbers in the code, but nonetheless this may be not even a valid, but also the best alternative for really accessing a constant list of numbers fast. Also, it's generic and elegant in that the properties of the numbers themselves don't matter much, so it could be reused for Fibonacci numbers or whatever. In a real production world when time is money, this kind of strategies might well make a difference, especially when they don't compromise the clarity of the code at all.
  • Christoph
    Christoph almost 15 years
    shoultn't it be enough to loop while i <= Math.sqrt(limit) in the outer loop?
  • SND
    SND almost 15 years
    To improve performance you could have cache - a static list<int> that holds in memory all the primes upto the maximum that has been requested. Thus you only read from slow disk once and then only when absolutely necessary. Thus we have a best case runtime of O(n) (to build the list), dramatically better than the other solutions.
  • starblue
    starblue almost 15 years
    @Christoph Yes, it is ok, and the floating point precision is good enough to represent the relevant integers exactly. It reduces runtime by about 20%, so I changed it.
  • yairchu
    yairchu almost 15 years
    testing that prime*prime <= curTest in the loop instead of pre-calculating the square root will probably make it faster and will make it more generic (will work for bignums, etc)
  • sidgeon smythe
    sidgeon smythe almost 15 years
    I'd expect that it would reduce runtime even more if you either calculate Math.sqrt(limit) outside the loop instead of calculating it in each iteration, or change the iteration condition to i*i<=limit. Do they, and by how much?
  • starblue
    starblue almost 15 years
    Ok, updated to next version. :-) With the tighter bound we don't need long in the inner loop, which saves another 10 to 15% runtime. i * i <= limit is minimally faster than i < Math.sqrt(limit), I think because it does the comparison in integers and avoids conversion to floating point. Extracting the sqrt and casting to in is the same, extracting sqrt but not casting to int gains nothing.
  • Luis Filipe
    Luis Filipe almost 15 years
    Why using the square root? What's the mathematical background for such option? I, probably dully, would only divide by 2.
  • David Johnstone
    David Johnstone almost 15 years
    Because if a number has prime factors, at least one of them must be less than or equal to the square root. If a * b = c and a <= b then a <= sqrt(c) <= b.
  • David Johnstone
    David Johnstone almost 15 years
    The estimate pi(n) = n / log(n) underestimates the nth prime, so the sieve will be smaller than necessary. Luckily, I've worked out a better way of doing it: stackoverflow.com/questions/1042717/…
  • starblue
    starblue almost 15 years
    @David Johnstone No, pi(n) = n / log(n) underestimates the number of primes up to n, which goes in the opposite direction. I'm glad you found a much nicer approximation, though.
  • Accipitridae
    Accipitridae over 14 years
    Did someone actually check, whether reading primes from a file is faster than generating them with the sieve of Eratosthenes? The sieve runs in time O(n log(log(n))) and all its steps are very simple. Thus it is well possible that computing primes is significantly faster than reading them slowly from a hard disc.
  • Letterman
    Letterman over 14 years
    i'd do that with a bool instead of enum...
  • Nick Larsen
    Nick Larsen over 13 years
    if you are willing to remove all of the multiples of 2 in its own loop, you can use j+= 2 * i as your loop increment to save some extra runtime, and you can calculate that once using a bit shift
  • starblue
    starblue over 13 years
    By replacing BitSet by a class implementing wheel factorization for 2, 3 and 5 it becomes almost 3 times faster.
  • Brian
    Brian over 13 years
    Miller-Rabbin is very fast and the code is very simple. Giving it sufficient iterations makes it reliable enough to be in competition with random CPU failure in terms of likelihood of a false positive. The downside of the algorithm is that understanding why it actually works is a difficult task.
  • Jacobs Data Solutions
    Jacobs Data Solutions over 12 years
    FYI - I had to change your main loop counter to "for (int i = 0; i * i <= limit && i * i > 0; i++)" in order to prevent an overflow.
  • Will Ness
    Will Ness about 11 years
    this admits squares of primes as prime numbers.
  • Dave Black
    Dave Black about 11 years
    I'm nearly positive that the calculation of the square root is optimized out by the JIT compiler (when compiled with optimization enabled). You'd have to verify this by examining the assembly generated (the IL is only partially optimized and is nowhere near the optimization performed by the JIT compiler. The days of loop hoisting and other micro optimizations are way over. In fact, sometimes trying to outsmart the JIT can slow down your code.
  • Will Ness
    Will Ness about 10 years
    @Accipitridae exactly; and not only is it well possible, but it is also entirely plausible and very likely that the sieve is faster, and in fact I saw it stated several times that it indeed is. Faster. :) :)
  • jahackbeth
    jahackbeth about 9 years
    This implementation of the Sieve of Sundaram is one of the very few correct ones out there. Most of them use wrong bounds for i and j while calculating i+j+2*i*j leading to incorrect output.
  • greybeard
    greybeard almost 9 years
    Trying it, I could gauge its effectiveness and presentation of resutls: please argue its elegance.
  • riz
    riz almost 9 years
    The styling of result is just an added part. Let me discuss the algorithm for return true/false as prime number. n%2 will eliminate half of the numbers because even number are always divisible by 2. In else code, I am dividing by odd numbers only, increasing divisible by two (so next divisible is also odd) upto half of that number which is prime or not. Why half, to not waste time because it will give us answer in fraction.
  • Will Ness
    Will Ness over 8 years
    log10(63)~=1.8, i.e. your data shows growth rate of n^1.8. This is very slow; optimal sieve of Eratosthenes implementations exibit ~ n^1.01..1.05; optimal trial division ~ n^1.35..1.45. Your isPrimeNubmer indeed implement the suboptimal tril division; its asymptotycs will worsen to about n^2 (or even above it) when you try to generate even more primes.
  • Avrohom Yisroel
    Avrohom Yisroel almost 4 years
    Why isn't this the accepted answer? The code here is much shorter, more elegant and way faster than the code in the accepted answer. Wish I could upvote more than once!