Most efficient code for the first 10000 prime numbers?
Solution 1
The Sieve of Atkin is probably what you're looking for, its upper bound running time is O(N/log log N).
If you only run the numbers 1 more and 1 less than the multiples of 6, it could be even faster, as all prime numbers above 3 are 1 away from some multiple of six. Resource for my statement
Solution 2
I recommend a sieve, either the Sieve of Eratosthenes or the Sieve of Atkin.
The sieve or Eratosthenes is probably the most intuitive method of finding a list of primes. Basically you:
- Write down a list of numbers from 2 to whatever limit you want, let's say 1000.
- Take the first number that isn't crossed off (for the first iteration this is 2) and cross off all multiples of that number from the list.
- Repeat step 2 until you reach the end of the list. All the numbers that aren't crossed off are prime.
Obviously there are quite a few optimizations that can be done to make this algorithm work faster, but this is the basic idea.
The sieve of Atkin uses a similar approach, but unfortunately I don't know enough about it to explain it to you. But I do know that the algorithm I linked takes 8 seconds to figure out all the primes up to 1000000000 on an ancient Pentium II-350
Sieve of Eratosthenes Source Code: http://web.archive.org/web/20140705111241/http://primes.utm.edu/links/programs/sieves/Eratosthenes/C_source_code/
Sieve of Atkin Source Code: http://cr.yp.to/primegen.html
Solution 3
This isn't strictly against the hardcoding restriction, but comes terribly close. Why not programatically download this list and print it out, instead?
http://primes.utm.edu/lists/small/10000.txt
Solution 4
GateKiller, how about adding a break
to that if
in the foreach
loop? That would speed up things a lot because if like 6 is divisible by 2 you don't need to check with 3 and 5. (I'd vote your solution up anyway if I had enough reputation :-) ...)
ArrayList primeNumbers = new ArrayList();
for(int i = 2; primeNumbers.Count < 10000; i++) {
bool divisible = false;
foreach(int number in primeNumbers) {
if(i % number == 0) {
divisible = true;
break;
}
}
if(divisible == false) {
primeNumbers.Add(i);
Console.Write(i + " ");
}
}
Solution 5
In Haskell, we can write down almost word for word the mathematical definition of the sieve of Eratosthenes, "primes are natural numbers above 1 without any composite numbers, where composites are found by enumeration of each prime's multiples":
import Data.List.Ordered (minus, union)
primes = 2 : minus [3..] (foldr (\p r -> p*p : union [p*p+p, p*p+2*p..] r)
[] primes)
primes !! 10000
is near-instantaneous.
References:
The above code is easily tweaked into working on odds only, primes = 2 : 3 : minus [5,7..] (foldr (\p r -> p*p : union [p*p+2*p, p*p+4*p..] r) [] (tail primes))
. Time complexity is much improved (to just about a log factor above optimal) by folding in a tree-like structure, and space complexity is drastically improved by multistage primes production, in
primes = 2 : _Y ( (3:) . sieve 5 . _U . map (\p -> [p*p, p*p+2*p..]) )
where
_Y g = g (_Y g) -- non-sharing fixpoint combinator
_U ((x:xs):t) = x : (union xs . _U . pairs) t -- ~= nub.sort.concat
pairs (xs:ys:t) = union xs ys : pairs t
sieve k s@(x:xs) | k < x = k : sieve (k+2) s -- ~= [k,k+2..]\\s,
| otherwise = sieve (k+2) xs -- when s⊂[k,k+2..]
(In Haskell the parentheses are used for grouping, a function call is signified just by juxtaposition, (:)
is a cons operator for lists, and (.)
is a functional composition operator: (f . g) x = (\y -> f (g y)) x = f (g x)
).
Comments
-
Niyaz almost 2 years
I want to print the first 10000 prime numbers. Can anyone give me the most efficient code for this? Clarifications:
- It does not matter if your code is inefficient for n >10000.
- The size of the code does not matter.
- You cannot just hard code the values in any manner.
-
jfs over 15 yearswhat is the time for lim=1_000_000 ? It can't be both `<1s' and '5s'.
-
jfs over 15 yearsName
primes
is misleading, in your code its meaningis_composite_number
. You may eliminate the first loop if you replacemalloc
bycalloc
. Expressionj+=i
can overflow for largelim
(and you'll miss some primes in that case). -
jfs over 15 yearsSieve of Eratosthenes could be faster for small N. See my answer.
-
Imran over 15 yearsFixed. < 1s for 100,000, ~5s for 1,000,000 Thanks for suggesting
calloc
and the alternative array name. I also thought primes[] is quite confusing, but couldn't think of a better name. -
Clayton over 15 yearsYou can speed that up if you exit that foreach loop when you get to number>squareroot(i).
-
Imran over 15 yearsReplacing the loop with
calloc
now gets lim = 100,000,000 done in ~4s -
Corey over 15 yearsFor most computers, calculating the values would be quicker than the latency involved in downloading them over the internet.
-
Daniel almost 15 yearsThough this is a good answer both Sieves only generate primes in the range [2, N], rather than the first N primes.
-
mpen over 14 yearslol @krog. why would you bother to set up a network connection and all that jazz to DL a static file each time? of course you'd predownload it, heck, even hardcode it into an array.
-
Beska about 14 yearsYou can still speed this up considerably by also breaking if number > sqrt(i).
-
Gabe about 14 yearsDaniel: the 10,000th prime is less than 105,000 so he just has to hardcode his sieve to go up to 105,000.
-
Jeff LaFay over 13 years1min for 10000 is pretty slow. In C# (not utilizing parallel fors/foreaches), on average I get 13seconds up to 10,000,000. Using one parallel for I get on average 10seconds for the same bound.
-
Will Ness about 12 yearsThe needed upper limit is easy to estimate, with some spare, as
m = n(log n + log (log n))
, forn>= 6
(see wikipedia). Plus, the sieve of Eratosthenes can be reformulated as segmented, making it truly incremental. -
Clint Eastwood about 10 yearsthis does not answer the question: he asked for the first N primes, not all the primes up to N...
-
Stephen C about 10 years@Daniel - see also the Prime Number Theorem - specifically, en.wikipedia.org/wiki/… which gives theoretical lower and upper bounds for the Nth prime number.
-
user207421 over 9 yearsThis can be sped up by several orders of magnitude: by breaking when you set
divisible = true
, by only processing odd numbers beyond 2, and by any of several other techniques including the one mentioned by @Clayton. Certainly not the 'most efficient'. -
qwr over 8 years
mpz_nextprime
calculates primes one at a time, instead of all at once with a sieve -
Will Ness about 8 yearshave you come up with the algorithm yourself? -- could you add few comments about how is it working?
-
DarthGizka about 8 years@Will: It's deque sieve. It is very elegant but fairly inefficient, and so it's mostly an intellectual curiosity like the Sieve of Atkin. Unlike the latter it can be actually useful on occasion, because it stays within the L1 cache much longer than a simple, unsegmented sieve and it is slightly simpler than an iterated segmented sieve (assuming that an adequate deque implementation is available). I could not find any good expositions on the 'net, so I'll write it up as a separate answer.
-
Will Ness about 8 yearsI was hoping for a descriptive pseudocode, couldn't understand his code's idiosyncrasies (
factors.resize(3)
followed byint factor = factors.front();
... don't see anything put into the deque, so what is he getting out of it?...). Will have to study your code, hopefully it'd be clearer and more straightforward. -
DarthGizka about 8 years@Will:
resize(3)
on an empty deque or vector has the effect of initialising it to three zeroes, just like my code does with the initialiser{ 0, 0, 0 }
. The easiest way of getting into the algorithm is mental symbolic execution for a few iterations, or loading it into LINQPad and debugging (i.e. watching it work). Hopefully my code should make this a bit easier than Ben's... Also: storing a factor in a given slot not only marks the slot as composite (i.e. a multiple of that factor); it is also the only place where that factor is stored, and it is implicitly the factor's 'working offset'. -
DarthGizka about 8 years@Will: I've added an explanation to the body of the text, right after the code of the deque sieve. If I did a poor job and you still have questions, just shoot!
-
Will Ness about 8 yearsOK, got it now, thanks. For me, in Ben's code, "factors" ought to be renamed as "sieve"; and "maybe_prime" - "candidate"; also, inside
main
, "factor" -> "mult_of". Then it becomes readable. Description of "sieve" is: sequence of ints serving as sieve markings (corresponding to consecutive odd numbers) past the current candidate where a non-zerof
indicates a multiple off
(e.g. {0,0,3} at 11, or {3,0,0,0,5} at 27). So it's the same as the usual sliding sieve, except it maintains the local sieve in this weird format, instead of just storing the actual numbers in a PriorityQueue. -
Will Ness about 8 years... perhaps it's done for efficiency's sake (PQ is perhaps under-performant, by comparison?... OTOH this sieve needs more space ). By "the usual sliding sieve" I mean e.g. this one, in Python. --- so, do you know of an origin, a source for this sieve? I asked this under Ben's answer too, but he's not responded yet. Is it well known, famous in some way?..
-
Will Ness about 8 years
-
Will Ness about 8 yearsof course, in Haskell, it is a truly illuminating one-liner
2 : fix ( (3:) . minus [5,7..] . unionAll . map (\p-> [p*p, p*p+2*p..]) )
(using aData.List.Ordered
module'sminus
andunionAll
that is) with Haskell's laziness keeping it "local". Again, not overly-performant, but illuminating. :) Thanks again for the pointers. -
DarthGizka about 8 yearsSince this is marked as 'answer': -1 for talking about asymptotics when the OP wants efficiency for a small, fixed n of 10000. -1 for linking the Sieve of Atkin without even skimming it: the article itself mentions that Atkin is only asymptotically better than Eratosthenes but in practice it is always slower if equal effort is spent on implementation; for the OP's problem Eratosthenes is orders of magnitude faster with much simpler code. The remark about mod 6 wheels doesn't make much sense: Atkin is based on the mod 30 wheel already, so there's no way of speeding it up with a lesser wheel.
-
DarthGizka about 8 years@Will: The deque sieve has a lot in common with Bennion's Hopping Sieve (more info including C source code at Will Galway's SieveStuff page), which was developed by Robert Bennion in the 1970s. In any case an interesting read for every connoisseur!
-
Will Ness about 8 yearsgreat, thanks. I think I saw it mentioned once before, but didn't find the sources... the "hopping tiles"... is it "working by columns" so to speak, vs. the usual strict bounded sieve working "by rows"? or some combination of both?... --- there's also a different algo by Gries and Misra, which I am yet to process fully. :) (it ought to be a variation on Euler's scheme, somehow...)
-
Will Ness about 8 years@DarthGizka I've coded its Haskell sort-of equivalent (the counting, minus the deque). It's at the bottom here. It's not very readable.
-
DarthGizka about 8 years@Will: mere mortals like me can only stand in awe - no chance at decoding and understanding such a dense notation. However, it should be extremely helpful for people who want to adapt the algorithm to languages of a similar bent (i.e. of a functional or other non-imperative persuasion). In any case the algorithm is ideal for coding (lazy) generators, when performance requirements do not warrant the effort for a segmented sieve...
-
Will Ness about 8 years@DarthGizka nah,it's pretty simple; repeats much of previous stuff from the haskellwiki prime numbers page. The new old thing is counting by ones:
tail(cycle (1:replicate(p-1)0))
. Instead of one deque into which newp
s are (destructibly) inserted, have for eachp
its own (immutable) stream of1,0,0,1,0,0,1,...
(that's for 3), and smash them together withzipWith
the pairwisemax 0 0 = 0; max 0 1 = 1;
etc.), after skipping the prefix from one square of prime to the next prime's square. And[x|u==0]
keeps the numbers for 0s that's still there. -
Will Ness about 8 years@DarthGizka (correction: it's
9,0,0,1,0,0,1,0,0,1,...
, for 3;25,0,0,0,0,1,0,0,...
, for 5). --- the proper - still high-level - sieve of E. with wheels, is at the bottom of this section. Also, on Rosetta code (and a full exposition, above that entry). Counting by ones is pretty disadvantageous, with the immutable data structures of Haskell (or FP in general). -
Will Ness about 8 yearsyou can introduce a new var
count=0
and increment it every time your set a truemark[j]
tofalse
. that way you'll have the correct count in the end and could allocate the array directly, without first creating a list. -
S_R almost 8 yearsYeah, optimizing the allocation on primes variable. Thanks!
-
BenGoldberg almost 8 yearsWill Ness, I did not come up with the algorithm myself. It was originally on rosettacode.org/wiki/…
-
Redu over 7 yearsRecently i have come up with a modified version of Sieve of Sundaram which seems to be twice as much performant compared to Sieve of Eratosthenes. I have implemented it in JS but I would love to see how it does in Haskell (which i am currently studying). May be you could check it up and include it in your answer if it is worthy.
-
Will Ness about 6 years@Redu s. of Sundaram is supposed to be algorithmically inferior to the s. of E., being confined to just the odd numbers as it is, whereas the s. of E. can be easily amended to use the higher wheels, like {2,3,5,7}-coprimes or even higher. also, can it be segmented? this improvement is also crucial.
-
Redu about 6 yearsThis "modified" Sundaram sieving is just so cool. Have you read my answer in detail? Please spare some time when you can and read it. It's real fast and also segmentable. It turned out to be the fastest among all JS algorithms that i could have found. Honestly you might be the only person that i can rely on this society for a fair evaluation. :)
-
Will Ness about 6 years@Redu you're right, I haven't read it yet. Now I'll have to do it! :)
-
Will Ness over 4 years"and cross off all multiples" but how are those multiples found is a crucial issue, often gotten wrong so you'd end up with a trial division sieve instead, which is much worse than the sieve of Eratosthenes asymptotically, and in practice too for all but very small
n
. -
ProfK over 2 yearsTesting each potential prime with a regex seems to be terribly slow