# Factorials with prime factorization (Python)

**Other implementations**: C |**Python**

The *n*th factorial, *n*!, can be computed straightforwardly by multiplying together the range of integers 1, 2, ... *n* from the bottom up. However, if *n* is large, this becomes inefficient. The *k*:th subproduct has roughly *k* log *k* digits, so multiplying by *k*+1 (assumed to fit in a single machine word) takes *k* log *k* time. This leads to a total time of O(*n*^{2} log *n*) for calculating *n*!.

Many algorithms exist for splitting a product of many factors into smaller pieces to balance the size of each subproduct. However, there is a trick to factorials: we can find the *prime factorization* of *n*! quickly, much more quickly than we can compute *n*! itself. And we can do this without stepping through the list 1, 2, ..., *n* and factorizing each of these numbers. The crucial observation is that *n*! must necessarily contain each prime number up to *n*, and that the power of the prime *p* in *n*! is given by

- .

Notice that this is actually a finite sum, since the floor is zero whenever *p*^{i} > *n*. This implementation is based on the following references:

- Peter Luschny.
*Fast Factorial Functions* - Peter Borwein. "On the Complexity of Calculating Factorials".
*Journal of Algorithms*6, 376-380 (1985) - Xavier Gourdon & Pascal Sebah.
*Binary splitting method*

In Python, the sum can be implemented as follows:

<<prime_factorial.py>>=defmultiplicity(n, p): """Return the power of the prime number p in the factorization of n!"""ifp > n:return0ifp > n//2:return1 q, m = n, 0whileq >= p: q //= p m += qreturnm

We now need a way to obtain the prime numbers up to *n*, for which purpose we implement a light-weight sieve of Eratosthenes:

<<prime_factorial.py>>=defprimes(n): """Generate a list of the prime numbers [2, 3, ... m] where m is the largest prime <= n.""" n = n + 1 sieve = range(n) sieve[:2] = [0, 0]foriinxrange(2, int(n**0.5)+1):ifsieve[i]:forjinxrange(i**2, n, i): sieve[j] = 0 # Filter out the composites, which have been replaced by 0'sreturn[pforpinsieveifp]

Knowing the prime factors of *n*!, and the exponent for each factor, all that needs to be done is taking powers. This is also the most important step of the calculation: if we just multiply all the primes together without further thought, no time will be saved. What we can do instead is perform recursive *exponentiation by squaring*. For exponentiation of a single integer, this is based on the fact that squaring the number cuts the remaining exponent in half. The technique can easily be applied to a list of numbers as well:

<<prime_factorial.py>>=defpowproduct(ns): """Compute the explicit value of a factored integer given as a list of (base, exponent) pairs."""ifnotns:return1 units = 1 multi = []forbase, expinns:ifexp == 0:continueelifexp == 1: units *= baseelse:ifexp % 2: units *= base multi.append((base, exp//2))returnunits * powproduct(multi)**2

Wrapping things together, we have:

<<prime_factorial.py>>=deffactorial(n):returnpowproduct((p, multiplicity(n, p))forpinprimes(n))

In practice, this program will perform worse than one based on direct multiplication when *n* is small. A practical implementation should use an empirically determined cutoff to enable prime factorization only for very large *n*, say, larger than a few hundred.

Many optimizations are also possible. The most obvious is to compute a large table of primes in advance instead of creating a list for each call to *factorial*, another is to handle powers of two using binary shifting.

Download code |