Question
I'm trying to calculate (3e28 choose 2e28)/2^(3e28). I tried scipy.misc.comb to calculate 3e28 choose 2e28 but it gave me inf. When I calculate 2^(3e28), it raised OverflowError: (34, 'Result too large'). How can I compute or estimate (3e28 choose 2e28)/2^(3e28)?
Answer 1:
Use Stirling's approximation (which is very accurate in the 1e10+ range), combined with logarithms:
(3e28 choose 2e28) / 2^(3e28) = 3e28! / [(3e28 - 2e28)! * 2e28!] / 2^(3e28)
= e^[log(3e28!) - log((3e28 - 2e28)!) - log(2e28!) - 3e28 * log(2)]
and from there apply Stirling's approximation:
log n! ~= log(sqrt(2*pi*n)) + n*log(n) - n
and you'll get your answer.
Here's an example of how accurate this approximation is:
>>> import math
>>> math.log(math.factorial(100))
363.73937555556347
>>> math.log((2*math.pi*100)**.5) + 100*math.log(100) - 100
363.7385422250079
For 100!, the approximation is already off by less than 0.01% in log space, and the relative error shrinks further as n grows.
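As a minimal sketch of the recipe above applied to the numbers in the question: everything stays in log space, so nothing overflows. The helper names log_factorial_stirling and log_ratio are illustrative only and do not come from any library.

```python
import math

def log_factorial_stirling(n):
    # Stirling's approximation: log n! ~= log(sqrt(2*pi*n)) + n*log(n) - n
    return 0.5 * math.log(2 * math.pi * n) + n * math.log(n) - n

def log_ratio(n, k):
    # natural log of (n choose k) / 2^n, computed entirely from logarithms
    return (log_factorial_stirling(n)
            - log_factorial_stirling(n - k)
            - log_factorial_stirling(k)
            - n * math.log(2))

ln_ratio = log_ratio(3e28, 2e28)
log2_ratio = ln_ratio / math.log(2)  # convert from natural log to base 2
print("natural log of the ratio:", ln_ratio)
print("base-2 log of the ratio: ", log2_ratio)
```

The base-2 logarithm should come out near -2.45e27, so the ratio itself is far too small for a float and only its logarithm is usable.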
Answer 2:
You can compute this ratio with the normal approximation to the binomial for large n. When n is large, k has to be relatively close to n/2 for (n choose k) / 2^n to not be negligible.
Code
Here's some code that will compute this:
import numpy as np

def n_choose_k_over_2_pow_n(n, k):
    # compute the mean and standard deviation of the normal
    # approximation (variance n/4, so sigma = sqrt(n)/2)
    mu = n / 2.
    sigma = np.sqrt(n) / 2.
    # now transform to a standard normal variable
    z = (k - mu) / sigma
    # normal pdf at k: standard normal pdf at z, divided by sigma
    return 1/(sigma * np.sqrt(2*np.pi)) * np.exp(-1/2. * z**2)
So that:
>>> print(n_choose_k_over_2_pow_n(3e28, 2e28))
0.0
>>> print('%.4g' % n_choose_k_over_2_pow_n(3e28, 1.5e28))
4.607e-15
As you can see, the computation underflows for k = 2e28. A solution is to compute the log of the answer, which we can do with this code:
def log_n_choose_k_over_2_pow_n(n, k):
    # compute the mean and standard deviation of the normal
    # approximation (variance n/4, so sigma = sqrt(n)/2)
    mu = n / 2.
    sigma = np.sqrt(n) / 2.
    # now transform to a standard normal variable
    z = (k - mu) / sigma
    # return the log of the answer
    return -1./2 * (np.log(2 * np.pi) + z**2) - np.log(sigma)
Another quick check:
>>> print('%.4g' % log_n_choose_k_over_2_pow_n(3e28, 2e28))
-1.667e+27
>>> print('%.4g' % log_n_choose_k_over_2_pow_n(3e28, 1.5e28))
-33.01
If we exponentiate these, we'll get our previous answers.
Explanation
We can do this by an appeal to results from statistics. The binomial distribution is given by:
P(K = k) = (n choose k) p^k * (1-p)^(n-k)
For large n, this is well-approximated by the normal distribution with mean n*p and variance n*p*(1-p).
Set p to be 1/2. Then we have:
P(K = k) = (n choose k) (1/2)^k * (1/2)^(n-k)
= (n choose k) (1/2)^n
= (n choose k) / (2^n)
Which is precisely the form of your ratio. Therefore, with mean n/2 and variance n/4 (standard deviation sqrt(n)/2), we can approximate your ratio by transforming k to a standard normal variable z, evaluating the standard normal pdf at z, and dividing by the standard deviation.
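To get a feel for how good the approximation is, here is a small sketch that compares it with the exact ratio for a moderate n, where exact integer arithmetic is still cheap. The function names exact_ratio and normal_approx_ratio are illustrative, and math.comb assumes Python 3.8 or later.

```python
import math

def exact_ratio(n, k):
    # exact (n choose k) / 2^n using Python's arbitrary-precision integers
    return math.comb(n, k) / 2**n

def normal_approx_ratio(n, k):
    # normal pdf with mean n/2 and variance n/4, evaluated at k
    mu = n / 2
    sigma = math.sqrt(n) / 2
    z = (k - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

n = 1000
for k in (500, 520):
    exact = exact_ratio(n, k)
    approx = normal_approx_ratio(n, k)
    print(k, exact, approx, "relative error:", abs(approx / exact - 1))
```

Near the centre of the distribution the two values should agree to within a fraction of a percent. The agreement is expected to degrade in the far tails, which is where k = 2e28 sits for n = 3e28.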
Answer 3:
The following uses log2comb from another answer of mine:
from math import log
from scipy.special import gammaln

def log2comb(n, k):
    # base-2 logarithm of (n choose k), via the log-gamma function
    return (gammaln(n+1) - gammaln(n-k+1) - gammaln(k+1)) / log(2)

log2p = log2comb(3e28, 2e28) - 3e28
print("log2p = %.12g" % log2p)
which prints
log2p = -2.45112497837e+27
So the base-2 logarithm of your number is about -2.45e27. If you try to compute 2**log2p, you get 0. That is, the number is smaller than the smallest positive number representable with standard 64-bit floating-point numbers.
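If SciPy is not available, a rough equivalent can be sketched with the standard library alone. This assumes math.lgamma is an acceptable stand-in for gammaln at this magnitude. The name log2comb_stdlib is made up for this illustration. The sketch also compares the result with the exponent of the smallest positive double, 2**-1074, and converts it to a power of ten.

```python
import math

def log2comb_stdlib(n, k):
    # same formula as log2comb, but using math.lgamma instead of scipy's gammaln
    return (math.lgamma(n + 1) - math.lgamma(n - k + 1)
            - math.lgamma(k + 1)) / math.log(2)

log2p = log2comb_stdlib(3e28, 2e28) - 3e28

# the smallest positive (subnormal) 64-bit float is 2**-1074
SMALLEST_DOUBLE_LOG2 = -1074
print("underflows to zero as a float:", log2p < SMALLEST_DOUBLE_LOG2)

# convert the base-2 exponent to a base-10 exponent for readability
log10p = log2p * math.log10(2)
print("p is roughly 10**(%.3g)" % log10p)
```

Only the exponent is meaningful here: with a logarithm this large in magnitude, double precision cannot resolve the leading digits of the number itself.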
Answer 4:
There are Python libraries that allow you to do arbitrary-precision arithmetic, for example mpmath, which SymPy uses.
You will have to rewrite your code to use the library functions though.
http://docs.sympy.org/latest/modules/mpmath/basics.html?highlight=precision
Edit: I just noticed the size of the numbers you are dealing with - much too large for my suggestion.
Source: https://stackoverflow.com/questions/27517186/calculate-very-large-number-using-python