Fast way to calculate n! mod m where m is prime?


Solution 1

Expanding my comment to an answer:

Yes, there are more efficient ways to do this, but they are extremely messy.

So unless you really need the extra performance, I don't suggest trying to implement them.


The key is to note that the modulus (which is essentially a division) is going to be the bottleneck operation. Fortunately, there are some very fast algorithms that allow you to perform modulus over the same number many times.

These methods are fast because they essentially eliminate the modulus.
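The answer's original links are lost in this copy, but one such method is Barrett-style reduction (a sketch, not necessarily what the answer linked): precompute a scaled reciprocal of the modulus once, after which each reduction is a multiply, a shift, and at most a couple of subtractions instead of a division. Shown here on Python big integers for clarity; real implementations work at machine-word width.

```python
K = 64  # assumed word size; requires m < 2^K

def barrett_setup(m):
    # Precompute mu = floor(2^(2K) / m) once per modulus.
    return (1 << (2 * K)) // m

def barrett_reduce(x, m, mu):
    # Valid for 0 <= x < m*m; q underestimates x // m by at most 2,
    # so a couple of conditional subtractions finish the job.
    q = (x * mu) >> (2 * K)
    r = x - q * m
    while r >= m:
        r -= m
    return r
```

Inside the factorial loop, `ans = barrett_reduce(ans * i, modulus, mu)` would then stand in for `ans = ans * i % modulus`.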


Those methods alone should give you a moderate speedup. To be truly efficient, you may need to unroll the loop to allow for better IPC:

Something like this:

ans0 = 1
ans1 = 1
for i in range(1, (n + 1) // 2):   # integer division; odd n handled separately
    ans0 = ans0 * (2*i + 0) % modulus
    ans1 = ans1 * (2*i + 1) % modulus

return ans0 * ans1 % modulus

but accounting for an odd number of iterations and combining it with one of the methods mentioned above.

Some may argue that loop-unrolling should be left to the compiler. I will counter-argue that compilers are currently not smart enough to unroll this particular loop. Have a closer look and you will see why.


Note that although my answer is language-agnostic, it is meant primarily for C or C++.

Solution 2

n can be arbitrarily large

Well, n can't be arbitrarily large - if n >= m, then n! ≡ 0 (mod m) (because m is one of the factors, by the definition of factorial).


Assuming n << m and you need an exact value, your algorithm can't get any faster, to my knowledge. However, if n > m/2, you can use the following identity (Wilson's theorem - Thanks @Daniel Fischer!)

(m-1)! ≡ -1 (mod m)

to cap the number of multiplications at about m-n:

(m-1)! ≡ -1 (mod m)
1 * 2 * 3 * ... * (n-1) * n * (n+1) * ... * (m-2) * (m-1) ≡ -1 (mod m)
n! * (n+1) * ... * (m-2) * (m-1) ≡ -1 (mod m)
n! ≡ -[(n+1) * ... * (m-2) * (m-1)]^(-1) (mod m)

This gives us a simple way to calculate n! (mod m) in m-n-1 multiplications, plus a modular inverse:

def factorialMod(n, modulus):
    ans=1
    if n <= modulus//2:
        #calculate the factorial normally (right argument of range() is exclusive)
        for i in range(1,n+1):
            ans = (ans * i) % modulus   
    else:
        #Fancypants method for large n
        for i in range(n+1,modulus):
            ans = (ans * i) % modulus
        ans = modinv(ans, modulus)
        ans = -1*ans + modulus
    return ans % modulus
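The code calls `modinv()`, which isn't defined in the answer; since the modulus is prime, Fermat's little theorem supplies one possible implementation:

```python
def modinv(a, m):
    # For prime m, a^(m-2) ≡ a^(-1) (mod m) by Fermat's little theorem.
    # On Python 3.8+, pow(a, -1, m) computes the same thing directly.
    return pow(a, m - 2, m)
```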

We can rephrase the above equation in another way that may or may not perform slightly faster. Using the following identity:

a ≡ a - m (mod m)

we can rephrase the equation as

n! ≡ -[(n+1) * ... * (m-2) * (m-1)]^(-1) (mod m)
n! ≡ -[(n+1-m) * ... * (m-2-m) * (m-1-m)]^(-1) (mod m)
       (reverse order of terms)
n! ≡ -[(-1) * (-2) * ... * -(m-n-2) * -(m-n-1)]^(-1) (mod m)
n! ≡ -[(1) * (2) * ... * (m-n-2) * (m-n-1) * (-1)^(m-n-1)]^(-1) (mod m)
n! ≡ [(m-n-1)!]^(-1) * (-1)^(m-n) (mod m)

This can be written in Python as follows:

def factorialMod(n, modulus):
    ans=1
    if n <= modulus//2:
        #calculate the factorial normally (right argument of range() is exclusive)
        for i in range(1,n+1):
            ans = (ans * i) % modulus   
    else:
        #Fancypants method for large n
        for i in range(1,modulus-n):
            ans = (ans * i) % modulus
        ans = modinv(ans, modulus)

        #Since m is an odd-prime, (-1)^(m-n) = -1 if n is even, +1 if n is odd
        if n % 2 == 0:
            ans = -1*ans + modulus
    return ans % modulus

If you don't need an exact value, life gets a bit easier - you can use Stirling's approximation to calculate an approximate value in O(log n) time (using exponentiation by squaring).
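For example, the magnitude can be read off the log-gamma function in the standard library (a different route to the same Stirling-quality estimate); note this approximates n! itself, not the residue n! mod m, which it cannot recover:

```python
import math

def log_factorial(n):
    # ln(n!) = lgamma(n + 1); accurate even for huge n, in O(1) time.
    return math.lgamma(n + 1)
```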


Finally, I should mention that if this is time-critical and you're using Python, try switching to C++. From personal experience, you should expect about an order-of-magnitude increase in speed or more, simply because this is exactly the sort of CPU-bound tight-loop that natively-compiled code excels at (also, for whatever reason, GMP seems much more finely-tuned than Python's Bignum).

Solution 3

n! mod m can be computed in O(n^(1/2+ε)) operations instead of the naive O(n). This requires use of FFT polynomial multiplication, and is only worthwhile for very large n, e.g. n > 10^4.

An outline of the algorithm and some timings can be seen here: http://fredrikj.net/blog/2012/03/factorials-mod-n-and-wilsons-theorem/

Solution 4

If we want to calculate M = a*(a+1) * ... * (b-1) * b (mod p), we can use the following approach, assuming we can add, subtract and multiply quickly (mod p), and get a running time of O( sqrt(b-a) * polylog(b-a) ).

For simplicity, assume (b-a+1) = k^2 is a perfect square. Now, we can divide our product into k parts, i.e. M = [a*..*(a+k-1)] *...* [(b-k+1)*..*b]. Each of the factors in this product is of the form q(x) = x*..*(x+k-1), for an appropriate x.

By using a fast polynomial multiplication algorithm, such as the Schönhage–Strassen algorithm, in a divide & conquer manner, one can find the coefficients of the polynomial q(x) in O( k * polylog(k) ). There is also an algorithm for evaluating the same degree-k polynomial at k points in O( k * polylog(k) ), which means we can calculate q(a), q(a+k), ..., q(b-k+1) fast.

This algorithm of substituting many points into one polynomial is described in the book "Prime numbers" by C. Pomerance and R. Crandall. Eventually, when you have these k values, you can multiply them in O(k) and get the desired value.

Note that all of our operations were performed (mod p). The exact running time is O( sqrt(b-a) * log(b-a)^2 * log(log(b-a)) ).
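To make the block structure concrete, here is a sketch using naive polynomial arithmetic. It is O(k^2) = O(b-a) overall, so the asymptotic gain only appears once the coefficient computation and the multipoint evaluation are replaced with the FFT-based O( k * polylog(k) ) algorithms described above:

```python
import math

def range_product_mod(a, b, p):
    """Compute a*(a+1)*...*b (mod p) via ~sqrt(b-a+1) blocks."""
    n = b - a + 1
    k = max(1, math.isqrt(n))
    # Build coefficients of q(x) = x*(x+1)*...*(x+k-1), lowest degree first.
    coeffs = [1]
    for j in range(k):
        new = [0] * (len(coeffs) + 1)
        for d, c in enumerate(coeffs):
            new[d + 1] = (new[d + 1] + c) % p      # the x * q(x) term
            new[d] = (new[d] + c * j) % p          # the j * q(x) term
        coeffs = new
    ans, x = 1, a
    while x + k - 1 <= b:
        # Horner evaluation of q at the start of each block
        val = 0
        for c in reversed(coeffs):
            val = (val * x + c) % p
        ans = ans * val % p
        x += k
    while x <= b:                                  # leftover factors
        ans = ans * x % p
        x += 1
    return ans
```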

Solution 5

Expanding on my comment, this takes about 50% of the time for all n in [100, 100007] where m=(117 | 1117):

Function facmod(n As Integer, m As Integer) As Integer
    Dim f As Integer = 1
    For i As Integer = 2 To n
        f = f * i
        If f >= m Then   ' >= so that f = m reduces to 0 rather than surviving
            f = f Mod m
        End If
    Next
    Return f
End Function
Author: John Smith

Updated on July 09, 2022

Comments

  • John Smith
    John Smith almost 2 years

    I was curious if there was a good way to do this. My current code is something like:

    def factorialMod(n, modulus):
        ans=1
        for i in range(1,n+1):
            ans = ans * i % modulus    
        return ans % modulus
    

    But it seems quite slow!

    I also can't calculate n! and then apply the prime modulus because sometimes n is so large that n! is just not feasible to calculate explicitly.

    I also came across http://en.wikipedia.org/wiki/Stirling%27s_approximation and wonder if this can be used at all here in some way?

    Or, how might I create a recursive, memoized function in C++?

    • Fred Foo
      Fred Foo about 12 years
      How slow is slow? From your pseudocode, I infer you're computing this in Python, is that right?
    • John Smith
      John Smith about 12 years
      Any language, really; it's pretty much the same in C++ in terms of syntax. I chose Python here because it's easy to read. Even in C++, though, I need a faster function.
    • Mysticial
      Mysticial about 12 years
      There's a very fast way to do this using invariant multiplication or possibly Montgomery reduction. Both methods eliminate the modulus and will allow for loop-unrolling techniques.
    • davin
      davin about 12 years
      You can break down modulus into prime factors to identify cases that will be zero more easily, although that won't help for large prime factors - how helpful this is depends on what you know about the modulus, if anything, and if prime factorisation tickles your fancy.
    • Andrew Morton
      Andrew Morton about 12 years
      You can shave a bit of time off by only doing the mod if ans > modulus (credit: tech.groups.yahoo.com/group/primenumbers/messages/… )
    • John Smith
      John Smith about 12 years
      @kilotaras: Yes, M is prime (sorry, did not know this was relevant; will change post)
    • BlueRaja - Danny Pflughoeft
      BlueRaja - Danny Pflughoeft about 12 years
      @John: I've improved my answer, and included some Python code. This version should improve performance greatly when n >> m/2 *(performance will be the same when n <= m/2). I really don't see how memoization could be implemented with this problem - there is nothing to memoize.
    • Thomas Ahle
      Thomas Ahle about 11 years
      If n! mod m could be calculated in polynomial time (in the number of bits), factoring would be in P. At least so they say on this complexity blog: rjlipton.wordpress.com/2013/05/06/a-most-perplexing-mystery
    • Alex M.
      Alex M. over 9 years
      In fact, the link that @ThomasAhle meant to give is: rjlipton.wordpress.com/2009/02/23/factoring-and-factorials . As a side note, I think that the ideas expressed there are flawed, as noticed in a comment on that page. No wonder there is no other mention of this fact on the whole internet.
  • John Smith
    John Smith about 12 years
    however, for large n, calculating n! and then performing mod is not feasible
  • cdeszaq
    cdeszaq about 12 years
    Not feasible...why? Due to memory constraints? From the question, speed was the issue, not memory. If you are looking to have as small a memory footprint as possible and then optimize for speed, please update your question to reflect this.
  • John Smith
    John Smith about 12 years
    Is there usually a tradeoff between speed and memory when it comes to factorials? I'll update the question either way
  • cdeszaq
    cdeszaq about 12 years
    @JohnSmith - There's always a speed/memory trade-off with any non-trivial calculation. I have updated my answer with some other non-approximate ways of calculating the values in a less time-dependent way. Typically, you run into issues dealing with gigantic numbers when you deal with factorials, which can lead to memory problems in some cases, but there are ways of handling it. (See Java's BigInt class)
  • sdcvvc
    sdcvvc about 12 years
    -1 Computing n! and then mod is very slow, please try to compute 2000000! mod 5250307 that way. OP is doing it better in the question, you should interleave multiplication and taking modulo.
  • Mysticial
    Mysticial about 12 years
It might be nice to get a comment from whoever just downvoted the 3 top answers.
  • BlueRaja - Danny Pflughoeft
    BlueRaja - Danny Pflughoeft about 12 years
@cdeszaq: What you seem to be missing is that multiplying two extremely large numbers (larger than the size of a register) is not O(1) on a computer: it's closer to O(m log m) (m = #bits). Multiplying two m-bit numbers results in (m+m)-bits, so your method takes approximately m log(m) + 2m log(m) + 3m log(m) + ... + nm log(m) = m log(m) * n(n+1)/2 = O(m n^2 log(m)) operations. Taking a modulus after each operation, however, would cost about 2(m log(m)) + 2(m log(m)) + ... (n terms) ... + 2(m log(m)) = 2mn log(m) = O(mn log(m)), which is significantly faster, even for small n.
  • John Smith
    John Smith about 12 years
    How might recursion + memoization be done in C++ for factoral mod m?
  • Mysticial
    Mysticial about 12 years
    @JohnSmith TBH, Memoization is probably not going to help at all - there's nothing to memoize. The only way it might become helpful is if you try the prime-factorization approach and use the windowing algorithm for exponentiation by squaring. (The windowing algorithm is a memoization algorithm.) But prime factorizing all integers from 1 to n will probably be slower than your current algorithm.
  • John Smith
    John Smith about 12 years
    Well in my case I am iterating from low n to high n, so doesn't that mean I can save time by storing values I've already calculated? For large n it seems like it'd save a lot of time by only doing a couple iterations rather than go from i=1 to n or n/2
  • Mysticial
    Mysticial about 12 years
    Well... There's nothing to "save". Knowing a couple iterations won't help you with the rest of them.
  • John Smith
    John Smith about 12 years
    But if I am calculating a bunch of numbers on the order of something like (20 million)! and greater, don't these iterations matter?
  • John Smith
    John Smith about 12 years
    I'll just save this as answer and make another question
  • Mysticial
    Mysticial about 12 years
    Ah ok, I was going to ask to clarify what you mean by "iterations matter". But I'll wait for your new question.
  • Daniel Fischer
    Daniel Fischer about 12 years
    "Thus, when m/2 < n < m, you only need to calculate (m/2)! * (-2)^(n-m/2-1) (mod m)" You can do better then. By Wilson's theorem, (m-1)! ≡ -1 (mod m) if m is prime. Now (m-1)! = n! * (m - (m-n-1)) * ... * (m - 1) ≡ (-1)^(m-n-1) * n! * (m-n-1)! (mod m), so n! ≡ (-1)^(m-n) * ((m-n-1)!)^(-1) (mod m). So you need to calculate (m-n-1)! mod m, find its modular inverse (O(log m) steps), and adjust the sign if necessary. Not much difference when n is close to m/2, but nice when n > 3m/4 or so.
  • BlueRaja - Danny Pflughoeft
    BlueRaja - Danny Pflughoeft about 12 years
    @DanielFischer: Thanks! I've included that in the answer.
  • ohad
    ohad over 9 years
    The algorithm of "substituting many points into one polynomial" is described also in the well known book "introduction to algorithms" by H. Cormen and others (in the FFT chapter).
  • Douglas Zare
    Douglas Zare about 9 years
    This is not helpful. BlueRaja-Danny-Pflughoeft already mentioned Wilson's theorem, and it doesn't do much because you can't count on needing just (m-1)!, or (m-k)! for small k, which his answer covered but yours didn't.
  • Bogdan Alexandru
    Bogdan Alexandru about 9 years
    Computing n! for very large n is not only slow, but quite impossible because the numbers get so large you can't address them any more.
  • Mercado
    Mercado over 7 years
    It's exactly the same as the naive algorithm implemented as a recursive function.
  • Robin Houston
    Robin Houston over 6 years
    This is a much better answer than the accepted answer.
  • Spencer
    Spencer about 2 years
I came to this question wondering about the specific case where m=4k+1 and n=2k, k possibly VERY large.