Dealing with very small numbers in R

Submitted by 六月ゝ 毕业季﹏ on 2019-11-27 02:45:01

Question


I need to calculate a list of very small numbers such as

(0.1)^1000, 0.2^(1200),

and then normalize them so they will sum up to one i.e.

a1 = 0.1^1000, a2 = 0.2^1200

And I want to calculate a1' = a1/(a1+a2), a2' = a2/(a1+a2).

I'm running into underflow problems, as I get a1 = 0. How can I get around this? In theory I could work with logs: log(a1) = 1000*log(0.1) would represent a1 without underflow. But in order to normalize I would need log(a1+a2), which I can't compute since I can't represent a1 directly.

I'm programming in R. As far as I can tell, there is no data type such as Decimal in C# that gives you better than double precision.

Any suggestions will be appreciated, thanks


Answer 1:


Mathematically speaking, one of those numbers will be approximately zero and the other approximately one. The difference between your numbers is huge, so I'm even wondering if this makes sense.

But to do that in general, you can use the idea from the logspace_add C function that's under the hood of R. One can define logxpy (= log(x+y)), given lx = log(x) and ly = log(y), as:

logxpy <- function(lx,ly) max(lx,ly) + log1p(exp(-abs(lx-ly)))

Which means that we can use :

> la1 <- 1000*log(0.1)
> la2 <- 1200*log(0.2)

> exp(la1 - logxpy(la1,la2))
[1] 5.807714e-162

> exp(la2 - logxpy(la1,la2))
[1] 1

This function can be called recursively as well if you have more numbers. Mind you, 1 is still 1, and not 1 minus 5.807...e-162. If you really need more precision and your platform supports long double types, you could code everything in e.g. C or C++ and return the results later on. But if I'm right, R can, for the moment, only deal with normal doubles, so ultimately you'll lose the precision again when the result is shown.
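For more than two numbers, the recursive use described above can be sketched roughly as follows. The helper name logsumexp and the third value 0.15^1100 are assumptions added for illustration; they are not part of the original answer.

```r
# logxpy as defined in the answer: log(x + y) from lx = log(x), ly = log(y)
logxpy <- function(lx, ly) max(lx, ly) + log1p(exp(-abs(lx - ly)))

# Hypothetical helper: log(sum(exp(lx))) for a whole vector,
# shifting by the maximum so that exp() never overflows
logsumexp <- function(lx) {
  m <- max(lx)
  m + log(sum(exp(lx - m)))
}

# Logs of 0.1^1000, 0.2^1200 and (illustrative) 0.15^1100
la <- c(1000 * log(0.1), 1200 * log(0.2), 1100 * log(0.15))

# Folding logxpy over the vector should agree with the vectorised helper
log_total_fold <- Reduce(logxpy, la)
log_total_vec  <- logsumexp(la)

# Normalised weights: exp(log(a_i) - log(sum(a)))
w <- exp(la - log_total_vec)
print(w)
```

Either way, the normalised values are obtained without ever forming 0.1^1000 directly; only differences of logs are exponentiated.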


EDIT :

to do the math for you :

log(x+y) = log(exp(lx) + exp(ly))
         = log(exp(lx) * (1 + exp(ly-lx)))
         = lx + log(1 + exp(ly-lx))

Now you just take the larger of the two as lx, and you arrive at the expression in logxpy().

EDIT 2 : Why take the maximum then? Easy: to ensure that the argument of exp() is never positive. If lx-ly gets too big, exp(lx-ly) returns Inf, which is not a correct result. exp(ly-lx) would return 0 instead, which gives a far better result:

Say lx=1 and ly=1000, then :

> 1+log1p(exp(1000-1))
[1] Inf
> 1000+log1p(exp(1-1000))
[1] 1000
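The same reasoning covers the very negative logs from the question, where the danger is underflow rather than overflow. A minimal sketch, assuming the values from the question (the variable names naive and stable are made up for illustration):

```r
logxpy <- function(lx, ly) max(lx, ly) + log1p(exp(-abs(lx - ly)))

la1 <- 1000 * log(0.1)
la2 <- 1200 * log(0.2)

# Naive version: both exp() calls underflow to 0, so this is log(0)
naive <- log(exp(la1) + exp(la2))

# Stable version: only exp(-abs(la1 - la2)) is evaluated, which is tiny but harmless
stable <- logxpy(la1, la2)

print(c(naive = naive, stable = stable))
```

Here the naive form loses everything, while the stable form stays finite and is essentially equal to la2, the larger of the two logs.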



Answer 2:


The Brobdingnag package deals with very large or small numbers, essentially wrapping Joris's answer into a convenient form.

library(Brobdingnag)
a1 <- as.brob(0.1)^1000
a2 <- as.brob(0.2)^1200
a1_dash <- a1 / (a1 + a2)
a2_dash <- a2 / (a1 + a2)
as.numeric(a1_dash)
as.numeric(a2_dash)



Answer 3:


Maybe you can treat a1 and a2 as fractions. In your example, with

a1 = (a1num/a1denom)^1000  # 1/10
a2 = (a2num/a2denom)^1200  # 1/5

you would arrive at

a1' = (a1num^1000 * a2denom^1200)/(a1num^1000 * a2denom^1200 + a1denom^1000 * a2num^1200)
a2' = (a1denom^1000 * a2num^1200)/(a1num^1000 * a2denom^1200 + a1denom^1000 * a2num^1200)

which can be computed using the gmp package:

library(gmp)
# a1' = 5^1200 / (5^1200 + 10^1000), evaluated exactly and converted to double at the end
a1 <- as.double(pow.bigz(5,1200) / (pow.bigz(5,1200) + pow.bigz(10,1000)))



Answer 4:


Try the arbitrary-precision package Rmpfr (an R interface to the MPFR library).

Ryacas may also be able to do arbitrary precision.



Source: https://stackoverflow.com/questions/5802592/dealing-with-very-small-numbers-in-r
