SSE slower than FPU?

孤城傲影 2021-02-08 19:32

I have a large block of code, part of which is this expression:

result = (nx * m_Lx + ny * m_Ly + m_Lz) / sqrt(nx * nx + ny * ny + 1);

3 Answers
  •  小鲜肉 (OP)
    2021-02-08 20:02

    My guess is that, with the FPU, the processor has time to compute the first multiplication while the next values are still being loaded. With SSE, all the values have to be loaded into the register before any computation can start.
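
To make the comparison concrete, here is a minimal sketch of the expression written both ways. The function names (shade_scalar, shade_packed, hsum) and the lane layout are assumptions for illustration only, since the question does not show its SSE code. Note also that on x86-64 most compilers emit scalar SSE instructions rather than x87 for the "scalar" version, so this only illustrates the packing step, not the exact timing difference.

```cpp
#include <cassert>
#include <cmath>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define HAVE_SSE 1
#endif

// Scalar form of the expression from the question.
// Each multiply can start as soon as its own two operands are available.
float shade_scalar(float nx, float ny, float Lx, float Ly, float Lz) {
    return (nx * Lx + ny * Ly + Lz) / std::sqrt(nx * nx + ny * ny + 1.0f);
}

#ifdef HAVE_SSE
// Hypothetical helper: sum of the four lanes, using SSE1 instructions only.
static float hsum(__m128 v) {
    __m128 hi = _mm_movehl_ps(v, v);                           // (v2, v3, v2, v3)
    __m128 s  = _mm_add_ps(v, hi);                             // lane0 = v0+v2, lane1 = v1+v3
    __m128 s1 = _mm_shuffle_ps(s, s, _MM_SHUFFLE(1, 1, 1, 1)); // broadcast lane1
    return _mm_cvtss_f32(_mm_add_ss(s, s1));                   // (v0+v2) + (v1+v3)
}

// Packed form (assumed layout). Every lane has to be filled before the
// first packed multiply can be issued, and the result needs a horizontal sum.
float shade_packed(float nx, float ny, float Lx, float Ly, float Lz) {
    __m128 n  = _mm_set_ps(0.0f, 1.0f, ny, nx); // lanes 0..3: nx, ny, 1, 0
    __m128 l  = _mm_set_ps(0.0f, Lz, Ly, Lx);   // lanes 0..3: Lx, Ly, Lz, 0
    __m128 nl = _mm_mul_ps(n, l);               // nx*Lx, ny*Ly, Lz, 0
    __m128 nn = _mm_mul_ps(n, n);               // nx*nx, ny*ny, 1, 0
    return hsum(nl) / std::sqrt(hsum(nn));
}
#else
// Fallback so the sketch still compiles on targets without SSE.
float shade_packed(float nx, float ny, float Lx, float Ly, float Lz) {
    return shade_scalar(nx, ny, Lx, Ly, Lz);
}
#endif
```

For a single evaluation like this, the packed version adds work (filling the lanes, the horizontal sums) that the scalar version does not need, so it is plausible for it to be no faster. The two versions add the terms in a different order, so their results can differ in the last bits.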
