How can I get an intrinsic for the exp() function in x64 code?

Submitted by 我与影子孤独终老i on 2019-12-04 02:35:46
David Heffernan

On x64, floating-point arithmetic is performed using SSE. SSE has no built-in instruction for exp(), so a call to the standard library is inevitable unless you write your own inline, manually vectorized __m128d exp(__m128d) (see "Fastest Implementation of Exponential Function Using SSE").
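
As a rough illustration of what such a hand-written routine could look like, here is a minimal sketch using only SSE2 intrinsics. The name exp_pd_sketch is hypothetical, and the method (range reduction by ln 2, a degree-13 Taylor polynomial, then scaling by 2^n through the exponent bits) is an assumption chosen for clarity, not the algorithm used by Microsoft's library or by the linked question. It clamps the input to [-708, 708] and does not handle NaN, overflow, or denormal results.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics

// Hypothetical helper: e^x for two doubles at once, using SSE2 only.
static inline __m128d exp_pd_sketch(__m128d x)
{
    // Clamp so that 2^n below is always a normal double (assumption of this sketch).
    x = _mm_min_pd(_mm_max_pd(x, _mm_set1_pd(-708.0)), _mm_set1_pd(708.0));

    // n = round(x * log2(e)), using the default round-to-nearest mode.
    const __m128i n  = _mm_cvtpd_epi32(_mm_mul_pd(x, _mm_set1_pd(1.4426950408889634)));
    const __m128d nd = _mm_cvtepi32_pd(n);

    // r = x - n*ln2, with ln2 split into a high and a low part to limit rounding error.
    __m128d r = _mm_sub_pd(x, _mm_mul_pd(nd, _mm_set1_pd(6.93147180369123816490e-01)));
    r = _mm_sub_pd(r, _mm_mul_pd(nd, _mm_set1_pd(1.90821492927058770002e-10)));

    // e^r for |r| <= ln2/2 via a Taylor polynomial (coefficients 1/k!), Horner form.
    static const double c[14] = {
        1.0, 1.0, 1.0 / 2.0, 1.0 / 6.0, 1.0 / 24.0, 1.0 / 120.0, 1.0 / 720.0,
        1.0 / 5040.0, 1.0 / 40320.0, 1.0 / 362880.0, 1.0 / 3628800.0,
        1.0 / 39916800.0, 1.0 / 479001600.0, 1.0 / 6227020800.0
    };
    __m128d p = _mm_set1_pd(c[13]);
    for (int k = 12; k >= 0; --k)
        p = _mm_add_pd(_mm_mul_pd(p, r), _mm_set1_pd(c[k]));

    // Build 2^n by writing the biased exponent (n + 1023) into bits 52..62 of each lane.
    __m128i e = _mm_add_epi32(n, _mm_set1_epi32(1023));
    e = _mm_unpacklo_epi32(e, _mm_setzero_si128());  // widen each 32-bit value to 64 bits
    e = _mm_slli_epi64(e, 52);

    return _mm_mul_pd(p, _mm_castsi128_pd(e));
}
```

To use it, load two doubles with _mm_set_pd or _mm_loadu_pd and store the result with _mm_storeu_pd. A production routine would normally replace the Taylor coefficients with a minimax polynomial and add the special-case handling omitted here.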

I imagine that the MSDN article you are referring to was written with 32-bit code that uses x87 floating point in mind.

I think the only reason that Microsoft provides an intrinsic version of 32-bit SSE2 exp() is the standard calling conventions. The 32-bit calling conventions require the operand to be pushed on the main stack, and the result to be returned in the top register of the FPU stack. If you have SSE2 code generation enabled, then the return value is likely to be popped from the FPU stack into memory, then loaded from that location into an SSE2 register for whatever maths you want to do on the result. Clearly, it is faster to pass the operand in an SSE2 register and return the result in an SSE2 register. This is what __libm_sse2_exp() does. In 64-bit code, the standard calling convention passes the operand and returns the result in SSE2 registers anyway, so there is no advantage in having an intrinsic version.

The reason for the performance difference between the 32-bit SSE2 and 64-bit implementations of exp() is that Microsoft uses different algorithms in the two implementations. I have no idea why they do this. The two implementations also produce results that differ by 1 ulp for some operands.
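
If you want to check such a 1 ulp discrepancy yourself, one common approach is to compare the bit patterns of the two results. The following is a minimal sketch with hypothetical helper names (ordered_bits, ulp_distance). It assumes IEEE 754 binary64 doubles and finite inputs.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: map a double to an integer whose ordering matches the
// ordering of the doubles (assumes IEEE 754 binary64; +0.0 and -0.0 both map to 0).
static std::int64_t ordered_bits(double v)
{
    std::int64_t i;
    std::memcpy(&i, &v, sizeof i);     // reinterpret the bits without undefined behaviour
    return i < 0 ? INT64_MIN - i : i;  // flip the negative range so it sorts below zero
}

// Number of representable doubles between a and b (0 means bit-identical or +/-0).
static std::uint64_t ulp_distance(double a, double b)
{
    const std::int64_t ia = ordered_bits(a);
    const std::int64_t ib = ordered_bits(b);
    // Subtract as unsigned so the difference cannot overflow.
    return ia > ib ? static_cast<std::uint64_t>(ia) - static_cast<std::uint64_t>(ib)
                   : static_cast<std::uint64_t>(ib) - static_cast<std::uint64_t>(ia);
}
```

Feeding the same operand to a 32-bit and a 64-bit build, dumping both results, and passing them to ulp_distance would show which operands are affected.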

GregC

EDIT: I'd like to add to this discussion links to AMD's x64 instruction set manuals and Intel's reference.

On initial inspection, there should be a way to use F2XM1 (which computes 2^x - 1 for -1 <= x <= 1) to compute the exponential. However, it belongs to the x87 instruction set, which is hidden in x64 mode.
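
For reference, the usual x87 recipe rewrites e^x as 2^(x * log2(e)), splits the exponent into an integer part n and a fraction f, applies F2XM1 to f, and scales by 2^n (FSCALE). The sketch below only mimics that decomposition in portable C++. Both function names are hypothetical, and std::exp2, std::nearbyint, and std::ldexp stand in for the F2XM1, FRNDINT, and FSCALE instructions. Because it works in double rather than the x87's 80-bit format, it loses accuracy for large |x|.

```cpp
#include <cmath>

// Stand-in for the x87 F2XM1 instruction: 2^f - 1, defined for -1 <= f <= 1.
static double f2xm1_standin(double f)
{
    return std::exp2(f) - 1.0;
}

// Hypothetical sketch of the x87-style decomposition e^x = 2^n * 2^f.
static double exp_via_f2xm1(double x)
{
    const double t = x * 1.4426950408889634;  // x * log2(e)
    const double n = std::nearbyint(t);       // integer part (FRNDINT on x87)
    const double f = t - n;                   // fraction, |f| <= 0.5
    // (2^f - 1) + 1 recovers 2^f; ldexp multiplies by 2^n (FSCALE on x87).
    return std::ldexp(f2xm1_standin(f) + 1.0, static_cast<int>(n));
}
```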

There's hope in using MMX/x87 explicitly, as described in a posting on the VirtualDub discussion boards. And this is how to actually write asm in VC++.
