Logarithm with SSE, or switch to FPU?

Posted by ∥☆過路亽.° on 2019-12-30 08:23:21

Question


I'm doing some statistics calculations. I need them to be fast, so I rewrote most of it to use SSE. I'm pretty much new to it, so I was wondering what the right approach here is:

To my knowledge, there is no log2 or ln function in SSE, at least not up to 4.1, which is the latest version supported by the hardware I use.

Is it better to:

  1. extract the 4 floats and do the FPU calculations on them to determine the entropy - I won't need to load any of those values back into SSE registers, just sum them up into another float
  2. find a function for SSE that does log2

Answer 1:


There seem to be a few SSE log2 implementations around, e.g. this one.

There is also the Intel Approximate Maths Library which has a log2 function among others - it's old (2000) but it's SSE2 and it should still work reasonably well.


See also:
  • sse_mathfun - SSE vector math library
  • avx_mathfun - AVX vector math library
  • libmvec - vector math library added in glibc 2.22



Answer 2:


There is no SSE instruction that implements a logarithm function. The closest x86 gets is the x87 instruction FYL2X, which computes y * log2(x) for one scalar value at a time, so there is nothing that works across a vector. If you're thinking about using a logarithm function like log or log10 from the C standard library, it's worth taking a look at how it is implemented in an open-source C library such as glibc. You can then roll your own logarithm approximation that operates on all elements of an SSE register at once.

Such a function is often implemented using a polynomial approximation, such as a truncated Taylor series, that meets some accuracy specification over a limited range of input arguments. You can then take advantage of logarithm properties to reduce an arbitrary input argument to the range your routine accepts. For example, writing x = m * 2^e with m in [1, 2) gives log2(x) = e + log2(m), so only log2(m) has to be approximated. In addition, you can parameterize the base of the logarithm by taking advantage of the property:

log_y(x) = log_a(x) / log_a(y)

Where a is the base of the logarithm routine that you created.
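
As a rough illustration of the steps above, here is a minimal SSE2 sketch. It is not taken from any of the libraries mentioned; the names `log2_ps_sketch` and `ln_ps_sketch` are made up for this example. It assumes every lane is a positive, normal, finite float (no handling of zero, negatives, denormals, infinity or NaN). The mantissa m in [1, 2) is approximated with the series ln(m) = 2 * (s + s^3/3 + s^5/5 + ...), where s = (m - 1) / (m + 1), truncated after the s^9 term. In theory the truncation error is on the order of 1e-6; measure it before relying on it.

```c
#include <emmintrin.h>  /* SSE2: integer ops on __m128i, casts to/from __m128 */

/* Sketch only: log2 of four positive, normal, finite floats. */
static __m128 log2_ps_sketch(__m128 x)
{
    const __m128i xi = _mm_castps_si128(x);

    /* Range reduction: x = m * 2^e. The exponent sits in bits 23..30 with
       a bias of 127 (the sign bit is assumed to be 0). */
    const __m128i ei = _mm_sub_epi32(_mm_srli_epi32(xi, 23),
                                     _mm_set1_epi32(127));
    const __m128 e = _mm_cvtepi32_ps(ei);

    /* Keep the 23 fraction bits and force the exponent to 0 => m in [1, 2). */
    const __m128i mi = _mm_or_si128(
        _mm_and_si128(xi, _mm_set1_epi32(0x007FFFFF)),
        _mm_set1_epi32(0x3F800000));
    const __m128 m = _mm_castsi128_ps(mi);

    /* s = (m - 1) / (m + 1), so 0 <= s < 1/3. */
    const __m128 one = _mm_set1_ps(1.0f);
    const __m128 s  = _mm_div_ps(_mm_sub_ps(m, one), _mm_add_ps(m, one));
    const __m128 s2 = _mm_mul_ps(s, s);

    /* Horner form of 1 + s2/3 + s2^2/5 + s2^3/7 + s2^4/9. */
    __m128 p = _mm_set1_ps(1.0f / 9.0f);
    p = _mm_add_ps(_mm_mul_ps(p, s2), _mm_set1_ps(1.0f / 7.0f));
    p = _mm_add_ps(_mm_mul_ps(p, s2), _mm_set1_ps(1.0f / 5.0f));
    p = _mm_add_ps(_mm_mul_ps(p, s2), _mm_set1_ps(1.0f / 3.0f));
    p = _mm_add_ps(_mm_mul_ps(p, s2), one);

    /* ln(m) = 2 * s * p, and log2(m) = ln(m) * log2(e) (base change). */
    const __m128 log2m = _mm_mul_ps(_mm_mul_ps(s, p),
                                    _mm_set1_ps(2.0f * 1.44269504f));

    return _mm_add_ps(e, log2m);  /* log2(x) = e + log2(m) */
}

/* Base change with a = 2: ln(x) = log2(x) / log2(e) = log2(x) * ln(2). */
static __m128 ln_ps_sketch(__m128 x)
{
    return _mm_mul_ps(log2_ps_sketch(x), _mm_set1_ps(0.69314718f));
}
```

Any other base follows the same pattern: multiply the log2 result by the constant 1 / log2(y). A production version would typically narrow the reduced range further or use minimax coefficients to get more accuracy per term.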



Source: https://stackoverflow.com/questions/8902971/logarithm-with-sse-or-switch-to-fpu
