Why don't C++ compilers do better constant folding?


迷失自我 2021-01-30 16:18

I'm investigating ways to speed up a large section of C++ code, which has automatic derivatives for computing Jacobians. This involves doing some amount of work in the actual r

3 answers

礼貌的吻别 (OP)

2021-01-30 16:40

One way to force a compiler to optimize away multiplications by 0s and 1s is to unroll the loop manually. For simplicity, let's use

#include <array>
#include <cstddef>
#include <utility>   // std::index_sequence, std::make_index_sequence
constexpr std::size_t n = 12;
using Array = std::array<double, n>;

Then we can implement a simple dot function using fold expressions (or recursion if they are not available):


// Expands to x[0] * y[0] + x[1] * y[1] + ... at compile time.
template<std::size_t... is>
double dot(const Array& x, const Array& y, std::index_sequence<is...>)
{
    return ((x[is] * y[is]) + ...);
}

double dot(const Array& x, const Array& y)
{
    return dot(x, y, std::make_index_sequence<n>{});
}
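The recursive alternative mentioned above, for compilers without fold expressions, might look like the following sketch. The names DotFrom and dot_recursive are hypothetical and not part of the original answer; the idea is only that template recursion produces the same fully unrolled sum.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t n = 12;
using Array = std::array<double, n>;

// Hypothetical helper: sums x[k] * y[k] for k in [i, n) by template recursion.
template<std::size_t i>
struct DotFrom {
    static double apply(const Array& x, const Array& y)
    {
        return x[i] * y[i] + DotFrom<i + 1>::apply(x, y);
    }
};

// Base case: the empty range contributes nothing.
template<>
struct DotFrom<n> {
    static double apply(const Array&, const Array&) { return 0.0; }
};

// Hypothetical name, chosen to avoid clashing with dot() above.
double dot_recursive(const Array& x, const Array& y)
{
    return DotFrom<0>::apply(x, y);
}
```

This compiles as C++11 and should give the optimizer the same straight-line expression to work with, though whether a given compiler folds it as well as the fold-expression version is something to check in the generated assembly.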

Now let's take a look at your function

double test(const Array& b)
{
    const Array a{1};    // = {1, 0, ...}
    return dot(a, b);
}

With -ffast-math, gcc 8.2 produces:

test(std::array<double, 12ul> const&):
  movsd xmm0, QWORD PTR [rdi]
  ret

clang 6.0.0 goes along the same lines:

test(std::array<double, 12ul> const&): # @test(std::array<double, 12ul> const&)
  movsd xmm0, qword ptr [rdi] # xmm0 = mem[0],zero
  ret

As another example, for

double test(const Array& b)
{
    const Array a{1, 1};    // = {1, 1, 0...}
    return dot(a, b);
}

we get

test(std::array<double, 12ul> const&):
  movsd xmm0, QWORD PTR [rdi]
  addsd xmm0, QWORD PTR [rdi+8]
  ret
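A note on why -ffast-math appears in these results: under strict IEEE 754 rules, 0 * x is not always 0, so the compiler generally may not drop such a multiplication on its own. The sketch below (times_zero is a hypothetical helper, not from the original answer) shows two cases where the product differs from +0.0. It assumes a build without -ffast-math.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper: multiplies its argument by a literal zero.
// Under IEEE 754 semantics the result depends on x:
//   x = infinity -> NaN
//   x = -1.0     -> -0.0 (negative zero)
double times_zero(double x)
{
    return 0.0 * x;
}
```

With -ffast-math the compiler is allowed to assume there are no NaNs or infinities and to ignore the sign of zero, which is what lets it reduce the whole dot product to a single load.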

Addition: Clang unrolls a plain for (std::size_t i = 0; i < n; ++i) ... loop without any of these fold-expression tricks; gcc does not and needs some help.
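For comparison, the plain-loop version referred to here could be written roughly as follows. This is a sketch: dot_loop and test_loop are hypothetical names, and how far a given compiler unrolls and folds it depends on the version and flags, so it is worth inspecting the assembly yourself.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t n = 12;
using Array = std::array<double, n>;

// Straightforward loop; unrolling is left entirely to the optimizer.
double dot_loop(const Array& x, const Array& y)
{
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i)
        sum += x[i] * y[i];
    return sum;
}

// Same shape as test() above: a = {1, 0, ..., 0}.
double test_loop(const Array& b)
{
    const Array a{1};
    return dot_loop(a, b);
}
```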
