Most efficient code for the first 10000 prime numbers?

performance algorithm primes

69,096

Solution 1

The Sieve of Atkin is probably what you're looking for, its upper bound running time is O(N/log log N).

If you only run the numbers 1 more and 1 less than the multiples of 6, it could be even faster, as all prime numbers above 3 are 1 away from some multiple of six. Resource for my statement

Solution 2

I recommend a sieve, either the Sieve of Eratosthenes or the Sieve of Atkin.

The sieve or Eratosthenes is probably the most intuitive method of finding a list of primes. Basically you:

Write down a list of numbers from 2 to whatever limit you want, let's say 1000.
Take the first number that isn't crossed off (for the first iteration this is 2) and cross off all multiples of that number from the list.
Repeat step 2 until you reach the end of the list. All the numbers that aren't crossed off are prime.

Obviously there are quite a few optimizations that can be done to make this algorithm work faster, but this is the basic idea.

The sieve of Atkin uses a similar approach, but unfortunately I don't know enough about it to explain it to you. But I do know that the algorithm I linked takes 8 seconds to figure out all the primes up to 1000000000 on an ancient Pentium II-350

Sieve of Eratosthenes Source Code: http://web.archive.org/web/20140705111241/http://primes.utm.edu/links/programs/sieves/Eratosthenes/C_source_code/

Sieve of Atkin Source Code: http://cr.yp.to/primegen.html

Solution 3

This isn't strictly against the hardcoding restriction, but comes terribly close. Why not programatically download this list and print it out, instead?

http://primes.utm.edu/lists/small/10000.txt

Solution 4

GateKiller, how about adding a break to that if in the foreach loop? That would speed up things a lot because if like 6 is divisible by 2 you don't need to check with 3 and 5. (I'd vote your solution up anyway if I had enough reputation :-) ...)

ArrayList primeNumbers = new ArrayList();

for(int i = 2; primeNumbers.Count < 10000; i++) {
    bool divisible = false;

    foreach(int number in primeNumbers) {
        if(i % number == 0) {
            divisible = true;
            break;
        }
    }

    if(divisible == false) {
        primeNumbers.Add(i);
        Console.Write(i + " ");
    }
}

Solution 5

In Haskell, we can write down almost word for word the mathematical definition of the sieve of Eratosthenes, "primes are natural numbers above 1 without any composite numbers, where composites are found by enumeration of each prime's multiples":

import Data.List.Ordered (minus, union)

primes = 2 : minus [3..] (foldr (\p r -> p*p : union [p*p+p, p*p+2*p..] r)
                                [] primes)

primes !! 10000 is near-instantaneous.

References:

_{The above code is easily tweaked into working on odds only, primes = 2 : 3 : minus [5,7..] (foldr (\p r -> p*p : union [p*p+2*p, p*p+4*p..] r) [] (tail primes)). Time complexity is much improved (to just about a log factor above optimal) by folding in a tree-like structure, and space complexity is drastically improved by multistage primes production, in}

primes = 2 : _Y ( (3:) . sieve 5 . _U . map (\p -> [p*p, p*p+2*p..]) )
  where
    _Y g = g (_Y g)                        -- non-sharing fixpoint combinator
    _U ((x:xs):t) = x : (union xs . _U . pairs) t       -- ~= nub.sort.concat
    pairs    (xs:ys:t)  = union xs ys : pairs t
    sieve k s@(x:xs) | k < x      = k : sieve (k+2) s   -- ~= [k,k+2..]\\s,
                     | otherwise  =     sieve (k+2) xs  --   when s⊂[k,k+2..]

_{(In Haskell the parentheses are used for grouping, a function call is signified just by juxtaposition, (:) is a cons operator for lists, and (.) is a functional composition operator: (f . g) x = (\y -> f (g y)) x = f (g x)).}

View more solutions

69,096

Author by

Niyaz

I hang out here.

Updated on July 08, 2022

Comments

Niyaz almost 2 years
I want to print the first 10000 prime numbers. Can anyone give me the most efficient code for this? Clarifications:
1. It does not matter if your code is inefficient for n >10000.
2. The size of the code does not matter.
3. You cannot just hard code the values in any manner.
jfs over 15 years

what is the time for lim=1_000_000 ? It can't be both `<1s' and '5s'.
jfs over 15 years

Name primes is misleading, in your code its meaning is_composite_number. You may eliminate the first loop if you replace malloc by calloc. Expression j+=i can overflow for large lim (and you'll miss some primes in that case).
jfs over 15 years

Sieve of Eratosthenes could be faster for small N. See my answer.
Imran over 15 years

Fixed. < 1s for 100,000, ~5s for 1,000,000 Thanks for suggesting calloc and the alternative array name. I also thought primes[] is quite confusing, but couldn't think of a better name.
Clayton over 15 years

You can speed that up if you exit that foreach loop when you get to number>squareroot(i).
Imran over 15 years

Replacing the loop with calloc now gets lim = 100,000,000 done in ~4s
Corey over 15 years

For most computers, calculating the values would be quicker than the latency involved in downloading them over the internet.
Daniel almost 15 years

Though this is a good answer both Sieves only generate primes in the range [2, N], rather than the first N primes.
mpen over 14 years

lol @krog. why would you bother to set up a network connection and all that jazz to DL a static file each time? of course you'd predownload it, heck, even hardcode it into an array.
Beska about 14 years

You can still speed this up considerably by also breaking if number > sqrt(i).
Gabe about 14 years

Daniel: the 10,000th prime is less than 105,000 so he just has to hardcode his sieve to go up to 105,000.
Jeff LaFay over 13 years

1min for 10000 is pretty slow. In C# (not utilizing parallel fors/foreaches), on average I get 13seconds up to 10,000,000. Using one parallel for I get on average 10seconds for the same bound.
Will Ness about 12 years

The needed upper limit is easy to estimate, with some spare, as m = n(log n + log (log n)), for n>= 6 (see wikipedia). Plus, the sieve of Eratosthenes can be reformulated as segmented, making it truly incremental.
Clint Eastwood about 10 years

this does not answer the question: he asked for the first N primes, not all the primes up to N...
Stephen C about 10 years

@Daniel - see also the Prime Number Theorem - specifically, en.wikipedia.org/wiki/… which gives theoretical lower and upper bounds for the Nth prime number.
user207421 over 9 years

This can be sped up by several orders of magnitude: by breaking when you set divisible = true, by only processing odd numbers beyond 2, and by any of several other techniques including the one mentioned by @Clayton. Certainly not the 'most efficient'.
qwr over 8 years

mpz_nextprime calculates primes one at a time, instead of all at once with a sieve
Will Ness about 8 years

have you come up with the algorithm yourself? -- could you add few comments about how is it working?
DarthGizka about 8 years

@Will: It's deque sieve. It is very elegant but fairly inefficient, and so it's mostly an intellectual curiosity like the Sieve of Atkin. Unlike the latter it can be actually useful on occasion, because it stays within the L1 cache much longer than a simple, unsegmented sieve and it is slightly simpler than an iterated segmented sieve (assuming that an adequate deque implementation is available). I could not find any good expositions on the 'net, so I'll write it up as a separate answer.
Will Ness about 8 years

I was hoping for a descriptive pseudocode, couldn't understand his code's idiosyncrasies (factors.resize(3) followed by int factor = factors.front();... don't see anything put into the deque, so what is he getting out of it?...). Will have to study your code, hopefully it'd be clearer and more straightforward.
DarthGizka about 8 years

@Will: resize(3) on an empty deque or vector has the effect of initialising it to three zeroes, just like my code does with the initialiser { 0, 0, 0 }. The easiest way of getting into the algorithm is mental symbolic execution for a few iterations, or loading it into LINQPad and debugging (i.e. watching it work). Hopefully my code should make this a bit easier than Ben's... Also: storing a factor in a given slot not only marks the slot as composite (i.e. a multiple of that factor); it is also the only place where that factor is stored, and it is implicitly the factor's 'working offset'.
DarthGizka about 8 years

@Will: I've added an explanation to the body of the text, right after the code of the deque sieve. If I did a poor job and you still have questions, just shoot!
Will Ness about 8 years

OK, got it now, thanks. For me, in Ben's code, "factors" ought to be renamed as "sieve"; and "maybe_prime" - "candidate"; also, inside main, "factor" -> "mult_of". Then it becomes readable. Description of "sieve" is: sequence of ints serving as sieve markings (corresponding to consecutive odd numbers) past the current candidate where a non-zero f indicates a multiple of f (e.g. {0,0,3} at 11, or {3,0,0,0,5} at 27). So it's the same as the usual sliding sieve, except it maintains the local sieve in this weird format, instead of just storing the actual numbers in a PriorityQueue.
Will Ness about 8 years

... perhaps it's done for efficiency's sake (PQ is perhaps under-performant, by comparison?... OTOH this sieve needs more space ). By "the usual sliding sieve" I mean e.g. this one, in Python. --- so, do you know of an origin, a source for this sieve? I asked this under Ben's answer too, but he's not responded yet. Is it well known, famous in some way?..
Will Ness about 8 years

related comment.
Will Ness about 8 years

of course, in Haskell, it is a truly illuminating one-liner 2 : fix ( (3:) . minus [5,7..] . unionAll . map (\p-> [p*p, p*p+2*p..]) ) (using a Data.List.Ordered module's minus and unionAll that is) with Haskell's laziness keeping it "local". Again, not overly-performant, but illuminating. :) Thanks again for the pointers.
DarthGizka about 8 years

Since this is marked as 'answer': -1 for talking about asymptotics when the OP wants efficiency for a small, fixed n of 10000. -1 for linking the Sieve of Atkin without even skimming it: the article itself mentions that Atkin is only asymptotically better than Eratosthenes but in practice it is always slower if equal effort is spent on implementation; for the OP's problem Eratosthenes is orders of magnitude faster with much simpler code. The remark about mod 6 wheels doesn't make much sense: Atkin is based on the mod 30 wheel already, so there's no way of speeding it up with a lesser wheel.
DarthGizka about 8 years

@Will: The deque sieve has a lot in common with Bennion's Hopping Sieve (more info including C source code at Will Galway's SieveStuff page), which was developed by Robert Bennion in the 1970s. In any case an interesting read for every connoisseur!
Will Ness about 8 years

great, thanks. I think I saw it mentioned once before, but didn't find the sources... the "hopping tiles"... is it "working by columns" so to speak, vs. the usual strict bounded sieve working "by rows"? or some combination of both?... --- there's also a different algo by Gries and Misra, which I am yet to process fully. :) (it ought to be a variation on Euler's scheme, somehow...)
Will Ness about 8 years

@DarthGizka I've coded its Haskell sort-of equivalent (the counting, minus the deque). It's at the bottom here. It's not very readable.
DarthGizka about 8 years

@Will: mere mortals like me can only stand in awe - no chance at decoding and understanding such a dense notation. However, it should be extremely helpful for people who want to adapt the algorithm to languages of a similar bent (i.e. of a functional or other non-imperative persuasion). In any case the algorithm is ideal for coding (lazy) generators, when performance requirements do not warrant the effort for a segmented sieve...
Will Ness about 8 years

@DarthGizka nah,it's pretty simple; repeats much of previous stuff from the haskellwiki prime numbers page. The new old thing is counting by ones: tail(cycle (1:replicate(p-1)0)). Instead of one deque into which new ps are (destructibly) inserted, have for each p its own (immutable) stream of 1,0,0,1,0,0,1,...(that's for 3), and smash them together withzipWiththe pairwise max 0 0 = 0; max 0 1 = 1; etc.), after skipping the prefix from one square of prime to the next prime's square. And [x|u==0] keeps the numbers for 0s that's still there.
Will Ness about 8 years

@DarthGizka (correction: it's 9,0,0,1,0,0,1,0,0,1,..., for 3; 25,0,0,0,0,1,0,0,..., for 5). --- the proper - still high-level - sieve of E. with wheels, is at the bottom of this section. Also, on Rosetta code (and a full exposition, above that entry). Counting by ones is pretty disadvantageous, with the immutable data structures of Haskell (or FP in general).
Will Ness about 8 years

you can introduce a new var count=0 and increment it every time your set a true mark[j] to false. that way you'll have the correct count in the end and could allocate the array directly, without first creating a list.
S_R almost 8 years

Yeah, optimizing the allocation on primes variable. Thanks!
BenGoldberg almost 8 years

Will Ness, I did not come up with the algorithm myself. It was originally on rosettacode.org/wiki/…
Redu over 7 years

Recently i have come up with a modified version of Sieve of Sundaram which seems to be twice as much performant compared to Sieve of Eratosthenes. I have implemented it in JS but I would love to see how it does in Haskell (which i am currently studying). May be you could check it up and include it in your answer if it is worthy.
Will Ness about 6 years

@Redu s. of Sundaram is supposed to be algorithmically inferior to the s. of E., being confined to just the odd numbers as it is, whereas the s. of E. can be easily amended to use the higher wheels, like {2,3,5,7}-coprimes or even higher. also, can it be segmented? this improvement is also crucial.
Redu about 6 years

This "modified" Sundaram sieving is just so cool. Have you read my answer in detail? Please spare some time when you can and read it. It's real fast and also segmentable. It turned out to be the fastest among all JS algorithms that i could have found. Honestly you might be the only person that i can rely on this society for a fair evaluation. :)
Will Ness about 6 years

@Redu you're right, I haven't read it yet. Now I'll have to do it! :)
Will Ness over 4 years

"and cross off all multiples" but how are those multiples found is a crucial issue, often gotten wrong so you'd end up with a trial division sieve instead, which is much worse than the sieve of Eratosthenes asymptotically, and in practice too for all but very small n.
ProfK over 2 years

Testing each potential prime with a regex seems to be terribly slow