Using AVX intrinsics instead of SSE does not improve speed — why?

借酒劲吻你 2021-01-30 03:59

I've been using Intel's SSE intrinsics for quite some time with good performance gains. Hence, I expected the AVX intrinsics to further speed up my programs. This, unfortunate

4 Answers
  •  礼貌的吻别
    2021-01-30 04:47

    This is because VSQRTPS (the 256-bit AVX instruction) takes exactly twice as many cycles as SQRTPS (the 128-bit SSE instruction) on a Sandy Bridge processor. See Agner Fog's optimization manuals: instruction tables, page 88.

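    Instructions like square root and division don't benefit from AVX there: each instruction handles twice as many floats but takes twice as long, so throughput is unchanged. On the other hand, additions, multiplications, etc., do.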
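The arithmetic behind that claim can be sketched in a few lines of C. This is only a back-of-the-envelope model, not a benchmark: the helper names `elements_per_cycle` and `avx_speedup` are made up for illustration, and the cycle counts passed in are placeholders rather than figures taken from the instruction tables.

```c
#include <assert.h>

/* Floats processed per cycle by a vector instruction that handles
   `lanes` floats and occupies its execution unit for `cycles` cycles.
   Both inputs are hypothetical placeholders, not measured values. */
static double elements_per_cycle(int lanes, double cycles)
{
    assert(lanes > 0 && cycles > 0.0);
    return lanes / cycles;
}

/* Throughput ratio of the 256-bit form (8 floats) over the
   128-bit form (4 floats), given the cycles each one takes. */
static double avx_speedup(double sse_cycles, double avx_cycles)
{
    return elements_per_cycle(8, avx_cycles) /
           elements_per_cycle(4, sse_cycles);
}
```

Under this model, `avx_speedup(c, 2 * c)` evaluates to 1 (the square root case, where doubling the width also doubles the cycles), while `avx_speedup(c, c)` evaluates to 2 (the case of an instruction whose cycle count does not grow with the wider registers).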
