问题
(1) The simple version of the problem:
How to calculate log(P1+P2+...+Pn), given log(P1), log(P2), ..., log(Pn), without taking the exp of any terms to get the original Pi. I don't want to get the original Pi because they are super small and may cause numeric computer underflow.
(2) The long version of the problem:
I am using Bayes' Theorem to calculate a conditional probability P(Y|E).
P(Y|E) = P(E|Y)*P(Y) / P(E)
I have a thousand probabilities multiplying together.
P(E|Y) = P(E1|Y) * P(E2|Y) * ... * P(E1000|Y)
To avoid computer numeric underflow, I used log(p) and calculate the summation of 1000 log(p) instead of calculating the product of 1000 p.
log(P(E|Y)) = log(P(E1|Y)) + log(P(E2|Y)) + ... + log(P(E1000|Y))
However, I also need to calculate P(E), which is
P(E) = sum of P(E|Y)*P(Y)
log(P(E))
does not equal to the sum of log(P(E|Y)*P(Y))
. How should I get log(P(E))
without solving for P(E|Y)*P(Y)
(they are extremely small numbers) and adding them.
回答1:
You can use
log(P1+P2+...+Pn) = log(P1[1 + P2/P1 + ... + Pn/P1])
= log(P1) + log(1 + P2/P1 + ... + Pn/P1])
which works for any Pi. So factoring out maxP = max_i Pi results in
log(P1+P2+...+Pn) = log(maxP) + log(1+P2/maxP + ... + Pn/maxP)
where all the ratios are less than 1.
来源:https://stackoverflow.com/questions/22009862/how-to-calculate-logsum-of-terms-from-its-component-log-terms