Rounding of double precision to single precision: Forcing an upper bound

Submitted by 泪湿孤枕 on 2019-12-11 04:22:29

Question


I'm using a Mersenne Twister implementation that returns double-precision numbers.

http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/VERSIONS/FORTRAN/fortran.html (implementation in Fortran 77 by Tsuyoshi Tada, I'm using genrand_real2)

However, my application needs a single-precision random number, in order to avoid warnings when multiplying numbers of different precisions. So I wrote a small function to convert between the two data types:

    function genrand_real()

    real   genrand_real
    real*8 genrand_real2

    genrand_real = real(genrand_real2())

    return
    end

I'm using real and real*8 to be consistent with the code I'm working on. It works perfectly most of the time (apart from the fact that I'm not sure how fast real() is). However, it changes the upper bound of my RNG, since the conversion turns [0,1) into [0,1]. I never thought about that until I had problems with it.
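The effect can be reproduced in isolation. The following is a minimal sketch, assuming IEEE 754 arithmetic with round-to-nearest; it is written in free-form Fortran rather than the Fortran 77 style used above, and the program name is only illustrative. It takes the largest value genrand_real2 can return and converts it with real():

```fortran
program rounding_demo
  implicit none
  double precision :: x
  real :: y

  ! Largest value genrand_real2 can return: (2**32 - 1) / 2**32
  x = 4294967295.d0 / 4294967296.d0
  ! Single precision has a 24-bit significand, so x rounds to the
  ! nearest representable value, which is 1.0
  y = real(x)

  print *, 'double is below 1:   ', x .lt. 1.d0
  print *, 'single is exactly 1: ', y .eq. 1.0
end program rounding_demo
```

Under those assumptions both lines should print T: the double-precision value lies inside [0,1), but its single-precision conversion lands exactly on 1.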

My question is: how can I enforce the upper bound efficiently, or how could I write a function similar to genrand_real2 (the original one) that returns single-precision reals? My guess is that I only need to replace the divisor 4294967296.d0, but I don't know by which number.

    function genrand_real2()

    double precision genrand_real2,r
    integer genrand_int32
    r=dble(genrand_int32())
    if(r.lt.0.d0)r=r+2.d0**32
    genrand_real2=r/4294967296.d0

    return
    end

Answer 1:


The function you posted does NOT generate random numbers itself. It only maps the random integers returned by genrand_int32() onto the interval [0,1): it adds 2^32 to the value if it is negative and then divides by 2^32 (which is exactly 4294967296). 2^32 is the number of distinct values a 32-bit integer can hold, roughly half of them negative and half non-negative (the positive side has one value fewer), which is why that constant is tied to genrand_int32().

Imagine you had numbers from -10 to 10 and wanted to restrict them to the interval [0,1]. The easiest solution is to add 20 to the negative numbers (so positive stay 0-10 and negative become 10-20) and then divide by 20. That's exactly what the function is doing, just with 2^31 instead of 10.

If you are wondering why the interval for your function is [0,1): zero also needs a spot, and a 32-bit pattern can only represent 2^32 distinct values, so you can't have 2^31 negative numbers, 2^31 positive numbers AND 0. The solution is to leave out the value +2^31 (the highest positive one). After the shift the largest value is therefore 2^32 - 1, and (2^32 - 1)/2^32 is strictly less than 1, so 1 is excluded from your interval.
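To make the mapping concrete, here is a small sketch that applies the same arithmetic as genrand_real2 to a few hand-picked integers instead of generator output. It assumes the default integer is 32 bits wide; to_unit and map_demo are illustrative names that do not appear in the original code.

```fortran
program map_demo
  implicit none

  print *, to_unit(0)          ! zero stays at 0
  print *, to_unit(huge(0))    ! largest positive integer: just below 0.5
  print *, to_unit(-1)         ! all bits set: largest result, just below 1

contains

  ! Same arithmetic as genrand_real2, but with the integer passed in
  function to_unit(i) result(u)
    integer, intent(in) :: i
    double precision :: u
    u = dble(i)
    if (u .lt. 0.d0) u = u + 2.d0**32
    u = u / 4294967296.d0
  end function to_unit

end program map_demo
```

Non-negative integers end up in [0,0.5) and negative ones in [0.5,1), so together they cover [0,1) without ever reaching 1 in double precision.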

So to bring the whole thing down to single precision:

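    function genrand_real2()

    ! Single-precision variant of the original genrand_real2.
    real genrand_real2,r
    integer genrand_int32
    r=real(genrand_int32())
    ! Real literals are used because 2**32 and 4294967296 do not fit
    ! into a 32-bit default integer.
    if(r.lt.0.0)r=r+2.0**32
    ! Caveat: a single-precision real has a 24-bit significand, so values
    ! close to 2**32 round up to 2**32 and the result can still be 1.0.
    genrand_real2=r/4294967296.0

    return
    end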

The magic numbers have to stay the same, because they relate to the integers, not the reals.
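One caveat is that a single-precision real carries only 24 significant bits, so dividing a rounded 32-bit value by 2^32 can still land on 1.0. A possible workaround, which is not part of the original answer, is to keep only the top 24 bits of the integer so that the conversion is exact. The sketch below compares both variants; it assumes IEEE 754 single precision and a 32-bit default integer, and the names scaled_sp, top24_sp and sp_unit_demo are hypothetical.

```fortran
program sp_unit_demo
  implicit none

  ! -1 has all 32 bits set, so it stands in for the largest generator output
  print *, 'scaled conversion reaches 1.0: ', scaled_sp(-1) .eq. 1.0
  print *, 'top-24-bit result below 1.0:   ', top24_sp(-1) .lt. 1.0
  print *, 'zero still maps to 0.0:        ', top24_sp(0) .eq. 0.0

contains

  ! Single-precision version of the add-2**32-and-divide scheme
  function scaled_sp(i) result(u)
    integer, intent(in) :: i
    real :: u
    u = real(i)
    if (u .lt. 0.0) u = u + 2.0**32
    u = u / 4294967296.0
  end function scaled_sp

  ! Keep only the top 24 bits, which a single-precision real holds exactly
  function top24_sp(i) result(u)
    integer, intent(in) :: i
    real :: u
    integer :: k
    ! ishft with a negative count shifts right and fills with zeros,
    ! so 0 <= k <= 2**24 - 1 even when i is negative
    k = ishft(i, -8)
    ! Largest result is (2**24 - 1) / 2**24, strictly below 1
    u = real(k) * 2.0**(-24)
  end function top24_sp

end program sp_unit_demo
```

Under those assumptions all three lines should print T. In real code the argument would be genrand_int32() rather than a constant. A simpler alternative is to clamp the converted value with min(x, nearest(1.0, -1.0)), at the cost of giving the top value slightly more weight than its neighbours.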

Edit: You already said it yourself, so I'm just repeating for other people: For portability it is technically not a good idea to use default types without specifying the precision. So you should do sp = selected_real_kind(6, 37) (sp for single precision) somewhere and then real(kind=sp)... and 2.0_sp and so forth. However, this is more of an academic point.
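A minimal sketch of that convention, assuming a Fortran 90 or later compiler; the module name kinds and the dp parameter are illustrative additions, while sp follows the naming suggested above:

```fortran
module kinds
  implicit none
  ! At least 6 decimal digits and an exponent range of 10**37
  integer, parameter :: sp = selected_real_kind(6, 37)
  ! At least 15 decimal digits and an exponent range of 10**307
  integer, parameter :: dp = selected_real_kind(15, 307)
end module kinds

program kind_demo
  use kinds
  implicit none
  real(kind=sp) :: a
  real(kind=dp) :: b

  ! The kind suffix gives the literal the requested precision
  a = 2.0_sp
  b = 2.0_dp
  ! precision() reports the decimal digits each kind actually provides
  print *, precision(a), precision(b)
end program kind_demo
```

Keeping the kind parameters in one module means the precision of the whole program can be changed in a single place.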



Source: https://stackoverflow.com/questions/37837909/rounding-of-double-precision-to-single-precision-forcing-an-upper-bound
