Fast SSE low precision exponential using double precision operations

广开言路 2021-01-12 16:37

I am looking for a fast, low-precision (~1e-3) SSE exponential function.

I came across this great answer:

/* max. rel. error = 3.55959567e-2 on [-         


        
1 Answer
  • 2021-01-12 17:25

    Something like this should do the job. The 1.05 constant needs tuning to get a lower maximal error; I have not done that here:

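    #include <cmath>        // std::log
    #include <emmintrin.h>  // SSE2 intrinsics

    __m128d fastexp(const __m128d &x)
    {
        // x / ln(2) plus a bias that puts the sum in [2048, 4096) for in-range x
        __m128d scaled = _mm_add_pd(_mm_mul_pd(x, _mm_set1_pd(1.0/std::log(2.0)) ), _mm_set1_pd(3*1024.0-1.05));

        // shifting left by 11 moves the upper mantissa bits into the exponent field
        return _mm_castsi128_pd(_mm_slli_epi64(_mm_castpd_si128(scaled), 11));
    }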
    

    With the constant as written, the relative error stays within a few percent (roughly +2.5% to -3.4%); for better precision you may need to add a second term.
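To check the accuracy, or to try other values for the 1.05 constant, a scalar model of the same bit trick is handy. The sketch below is not from the answer: `fastexp_scalar`, `max_rel_error` and the `offset` parameter are hypothetical names, and it assumes `double` is an IEEE 754 binary64 value.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Scalar model of the SSE routine (hypothetical helper, not from the answer).
// `offset` plays the role of the 1.05 constant.
double fastexp_scalar(double x, double offset = 1.05)
{
    // Same arithmetic as the SSE version: x / ln(2) plus the bias.
    double scaled = x * (1.0 / std::log(2.0)) + (3 * 1024.0 - offset);

    std::uint64_t bits;
    std::memcpy(&bits, &scaled, sizeof bits);    // reinterpret the double as an integer
    bits <<= 11;                                 // same shift as _mm_slli_epi64(..., 11)
    double result;
    std::memcpy(&result, &bits, sizeof result);  // reinterpret the integer as a double
    return result;
}

// Largest relative error against std::exp, sampled at steps + 1 evenly
// spaced points on [lo, hi]. Only meaningful where fastexp_scalar does
// not overflow or underflow.
double max_rel_error(double lo, double hi, int steps, double offset = 1.05)
{
    assert(steps > 0);
    double worst = 0.0;
    for (int i = 0; i <= steps; ++i) {
        double x = lo + (hi - lo) * i / steps;
        double err = std::fabs(fastexp_scalar(x, offset) / std::exp(x) - 1.0);
        if (err > worst) worst = err;
    }
    return worst;
}
```

For example, `max_rel_error(-10.0, 10.0, 100000)` reports the worst sampled error for the constant as written, and passing a different `offset` shows whether a candidate value does better.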

    Also, inputs that overflow or underflow produce unspecified results. You can avoid this by clamping the scaled value to a fixed range before the shift.
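A minimal sketch of the clamping idea follows. The function name `fastexp_clamped` and the bounds 2048 and 4095 are my own assumptions, not part of the answer. They are chosen so that the shifted bit pattern is +0.0 at the lower bound and +infinity at the upper bound. NaN inputs get no special treatment here.

```cpp
#include <cmath>        // std::log
#include <emmintrin.h>  // SSE2 intrinsics

// Clamped variant of fastexp (hypothetical sketch; the bounds are an assumption).
__m128d fastexp_clamped(__m128d x)
{
    __m128d scaled = _mm_add_pd(_mm_mul_pd(x, _mm_set1_pd(1.0 / std::log(2.0))),
                                _mm_set1_pd(3 * 1024.0 - 1.05));

    // After the shift, the exponent field of the result is floor(scaled - 2048).
    // scaled == 2048 gives an all-zero bit pattern (+0.0);
    // scaled == 4095 gives exponent field 2047 with a zero mantissa (+infinity).
    scaled = _mm_max_pd(scaled, _mm_set1_pd(2048.0));  // underflow saturates to +0.0
    scaled = _mm_min_pd(scaled, _mm_set1_pd(4095.0));  // overflow saturates to +infinity

    return _mm_castsi128_pd(_mm_slli_epi64(_mm_castpd_si128(scaled), 11));
}
```

Very small results just above the lower bound come out as denormals, so if those are slow on your target you may prefer a slightly higher lower bound.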
