I need to compute combinatorials (nCr) in Python but cannot find the function to do that in math
, numpy
or stat
libraries. Something
Why not write it yourself? It's a one-liner or such:
from operator import mul # or mul=lambda x,y:x*y
from fractions import Fraction
def nCk(n,k):
return int( reduce(mul, (Fraction(n-i, i+1) for i in range(k)), 1) )
Test - printing Pascal's triangle:
>>> for n in range(17):
... print ' '.join('%5d'%nCk(n,k) for k in range(n+1)).center(100)
...
1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
1 5 10 10 5 1
1 6 15 20 15 6 1
1 7 21 35 35 21 7 1
1 8 28 56 70 56 28 8 1
1 9 36 84 126 126 84 36 9 1
1 10 45 120 210 252 210 120 45 10 1
1 11 55 165 330 462 462 330 165 55 11 1
1 12 66 220 495 792 924 792 495 220 66 12 1
1 13 78 286 715 1287 1716 1716 1287 715 286 78 13 1
1 14 91 364 1001 2002 3003 3432 3003 2002 1001 364 91 14 1
1 15 105 455 1365 3003 5005 6435 6435 5005 3003 1365 455 105 15 1
1 16 120 560 1820 4368 8008 11440 12870 11440 8008 4368 1820 560 120 16 1
>>>
PS. edited to replace int(round(reduce(mul, (float(n-i)/(i+1) for i in range(k)), 1)))
with int(reduce(mul, (Fraction(n-i, i+1) for i in range(k)), 1))
so it won't err for big N/K
If you want exact results and speed, try gmpy -- gmpy.comb
should do exactly what you ask for, and it's pretty fast (of course, as gmpy
's original author, I am biased;-).
A literal translation of the mathematical definition is quite adequate in a lot of cases (remembering that Python will automatically use big number arithmetic):
from math import factorial
def calculate_combinations(n, r):
return factorial(n) // factorial(r) // factorial(n-r)
For some inputs I tested (e.g. n=1000 r=500) this was more than 10 times faster than the one liner reduce
suggested in another (currently highest voted) answer. On the other hand, it is out-performed by the snippit provided by @J.F. Sebastian.
Here's another alternative. This one was originally written in C++, so it can be backported to C++ for a finite-precision integer (e.g. __int64). The advantage is (1) it involves only integer operations, and (2) it avoids bloating the integer value by doing successive pairs of multiplication and division. I've tested the result with Nas Banov's Pascal triangle, it gets the correct answer:
def choose(n,r):
"""Computes n! / (r! (n-r)!) exactly. Returns a python long int."""
assert n >= 0
assert 0 <= r <= n
c = 1L
denom = 1
for (num,denom) in zip(xrange(n,n-r,-1), xrange(1,r+1,1)):
c = (c * num) // denom
return c
Rationale: To minimize the # of multiplications and divisions, we rewrite the expression as
n! n(n-1)...(n-r+1)
--------- = ----------------
r!(n-r)! r!
To avoid multiplication overflow as much as possible, we will evaluate in the following STRICT order, from left to right:
n / 1 * (n-1) / 2 * (n-2) / 3 * ... * (n-r+1) / r
We can show that integer arithmatic operated in this order is exact (i.e. no roundoff error).
Using dynamic programming, the time complexity is Θ(n*m) and space complexity Θ(m):
def binomial(n, k):
""" (int, int) -> int
| c(n-1, k-1) + c(n-1, k), if 0 < k < n
c(n,k) = | 1 , if n = k
| 1 , if k = 0
Precondition: n > k
>>> binomial(9, 2)
36
"""
c = [0] * (n + 1)
c[0] = 1
for i in range(1, n + 1):
c[i] = 1
j = i - 1
while j > 0:
c[j] += c[j - 1]
j -= 1
return c[k]
The direct formula produces big integers when n is bigger than 20.
So, yet another response:
from math import factorial
reduce(long.__mul__, range(n-r+1, n+1), 1L) // factorial(r)
short, accurate and efficient because this avoids python big integers by sticking with longs.
It is more accurate and faster when comparing to scipy.special.comb:
>>> from scipy.special import comb
>>> nCr = lambda n,r: reduce(long.__mul__, range(n-r+1, n+1), 1L) // factorial(r)
>>> comb(128,20)
1.1965669823265365e+23
>>> nCr(128,20)
119656698232656998274400L # accurate, no loss
>>> from timeit import timeit
>>> timeit(lambda: comb(n,r))
8.231969118118286
>>> timeit(lambda: nCr(128, 20))
3.885951042175293