Factorial of a large number in python

23,444

Solution 1

Multiplying the numbers in sequence,

r = 1
for i in range(1, n + 1):
    r *= i
return r

creates a large number (as in tens of thousands of bits) very quickly, and then you have a lot of multiplications of one huge number and one small number. Multiplications where at least one of the factors is huge are slow.

You can speed it up considerably by reducing the number of multiplications involving huge numbers, for example

def range_prod(lo,hi):
    if lo+1 < hi:
        mid = (hi+lo)//2
        return range_prod(lo,mid) * range_prod(mid+1,hi)
    if lo == hi:
        return lo
    return lo*hi

def treefactorial(n):
    if n < 2:
        return 1
    return range_prod(1,n)

produces, timing the computation of 100000! % 100019 (I first tried len(str(fun(100000)), but the conversion to string is abominably slow, so that made the difference seem smaller than it is):

$ python factorial.py 
81430
math.factorial took 4.06193709373 seconds
81430
factorial took 3.84716391563 seconds
81430
treefactorial took 0.344486951828 seconds

so a more than 10× speedup for 100000!.

Solution 2

Factorials get very large, so it is often better to deal with logarithms of the number.

Many languages have an lgamma library function which computes the natural logarithm of the factorial of n-1.

This means that you can compute the natural logarithm of factorial(n) via lgamma(n+1).

You can divide by log10 to turn this into a base 10 logarithm.

So if you just want the number of digits, then this Python code will give the answer immediately:

from math import *
print ceil(lgamma(100000+1)/log(10))

Solution 3

If you need a short execution time and don't need the best possible accuracy, you can use an approximation formula, e.g. Stirling approximation

Solution 4

If you just need an approximation, Ramanujan's factorial approximation is supposed to be more accurate than Stirling's.

If you need (or want) something precise, you might try GMP, the GNU Multiple Precision library. I've used it successfully for primality testing of large numbers in Python.

Solution 5

It's actually unusual to really need the true factorial value n! in many areas of application. Often it's way more realistic to use the natural log of the factorial. I can't think of any applications where the log can't be used as a better alternative, because factorials are most often used to compute values related to probabilities of choosing combinations of things.

A common thing to compute is probabilities that are based on factorials such as choosing the binomial coefficient (n k) = n! / (k!(n-k)!). Given this is a ratio of factorials then log(n k) = log(n!)-log(k!)-log((n-k)!) which is reliably computed using one of the various log factorial approximations. And if you do a lot of probability math it's generally best to do it in the log domain anyway (measuring probability in decibels) because it often involves extremely wide ranges of numbers less than 1, and so math precision will fall apart very quickly using common floating point representations if the log version is not used.

E.T.Jaynes was a famous mathematician and an expert in probability theory and I'd recommend his book "Probability Theory: The Logic of Science" as a very readable source on this topic and Bayesian reasoning and information theory using log probabilities.

Share:
23,444
Juan José Castro
Author by

Juan José Castro

Updated on January 12, 2021

Comments

  • Juan José Castro
    Juan José Castro over 3 years

    Here's my approach to factorials:

    def factorial(n):
        '''Returns factorial of n'''
        r = 1
        for i in range(1, n + 1):
            r *= i
        return r
    

    I think it's pretty straightforward, though I guess you could make something more efficient, because it takes ages for large numbers like 100000. My question is, is there? math.factorial() is no good either, it takes roughly the same amount of time.

  • Juan José Castro
    Juan José Castro about 11 years
    Well, what i need is to get the length of the factorial, but before I can do that I need to generate the number (or at least that's what I think). len(str(factorial(100000))) does in fact return 456574, after roughly a minute.
  • kindall
    kindall about 11 years
    logamma is in Python 2.7.x and in Python 3.2+ (but not in 3.0.x or 3.1.x).
  • Daniel Fischer
    Daniel Fischer about 11 years
    "It could also well pay to do multiplications in a different order, with intermediate results, such that you end up with multiplying only a few very big numbers at the end" I've tried that, in Haskell I get a big speedup from that, but my attempts in Python haven't made a big difference. Not sure why.
  • kindall
    kindall about 11 years
    math.gamma(100000) unfortunately raises OverflowError; the function only works for values up to about 170.
  • Daniel Fischer
    Daniel Fischer about 11 years
    Duh! It's not the algorithm, it's the conversion to string that is so abominably slow!
  • Staven
    Staven about 11 years
    Fun fact: looking at the sources, Python uses the Lanczos approximation for the gamma function (Modules/mathmodules.c).
  • endolith
    endolith about 8 years
    Are you ok with this being used in BSD-licensed SciPy? github.com/scipy/scipy/pull/6020
  • Daniel Fischer
    Daniel Fischer about 8 years
    @endolith Sure. I don't take any responsibility for errors in the code, though. Nor can I claim any originality regarding the algorithm, it's old.
  • endolith
    endolith about 8 years
    So this calculates 9! as ((2*3)*(4*5))*((6*7)*(8*9))? It might be better to use((2*9)*(3*8))*((4*7)*(5*6)) to keep them smaller for longer?
  • Daniel Fischer
    Daniel Fischer about 8 years
    @endolith That doesn't make much difference in terms of the size of the intermediate products. When I tested (long ago, in Haskell), I couldn't get that scheme to be faster than the above, since the code becomes more complicated. Do test, it might buy you a little, but don't expect too much from it.
  • Peter O.
    Peter O. about 5 years
    If you can, you should point to a link to "log factorial approximations", since they are the most relevant in your answer to the question of calculating the factorial of a large number.
  • denis
    denis over 4 years
    Some very readable pages of the preface of Jaynes's book are on books.google.com/books?isbn=0521592712
  • Cees Timmerman
    Cees Timmerman almost 4 years
    Doesn't appear to make a difference in Python 3.7.8 for 1000000.