Why don't C++ compilers do better constant folding?

迷失自我 2021-01-30 16:18

I'm investigating ways to speed up a large section of C++ code, which has automatic derivatives for computing Jacobians. This involves doing some amount of work in the actual r

3 Answers
  •  一整个雨季
    2021-01-30 16:37

    This is because Eigen explicitly vectorizes your code as 3 vmulpd, 2 vaddpd, and 1 horizontal reduction within the remaining 4-component register (this assumes AVX; with SSE only, you'll get 6 mulpd and 5 addpd). With -ffast-math, GCC and clang are allowed to remove the last 2 vmulpd and vaddpd (and this is what they do), but they cannot really replace the remaining vmulpd and horizontal reduction that have been explicitly generated by Eigen.
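To make the instruction count concrete, here is a rough sketch in plain C++ that mimics, with scalar code, how a 12-element multiply-and-sum splits into 4-wide AVX packets. The size 12 is an assumption inferred from the "3 vmulpd" figure (4 doubles per AVX register), and `Packet4`, `OpCount`, `load`, `pmul`, `padd`, `hsum` and `packetDot12` are hypothetical names for illustration, not Eigen's internals.

```cpp
#include <array>
#include <cstddef>

// Hypothetical stand-in for one AVX register holding 4 doubles.
using Packet4 = std::array<double, 4>;

// Counts how many packet operations the sketch performs.
struct OpCount {
    int mul = 0;
    int add = 0;
    int hsum = 0;
};

// Load 4 consecutive doubles into a packet.
Packet4 load(const double* p) { return {p[0], p[1], p[2], p[3]}; }

// Lane-wise multiply (plays the role of one vmulpd).
Packet4 pmul(const Packet4& a, const Packet4& b, OpCount& c) {
    ++c.mul;
    return {a[0] * b[0], a[1] * b[1], a[2] * b[2], a[3] * b[3]};
}

// Lane-wise add (plays the role of one vaddpd).
Packet4 padd(const Packet4& a, const Packet4& b, OpCount& c) {
    ++c.add;
    return {a[0] + b[0], a[1] + b[1], a[2] + b[2], a[3] + b[3]};
}

// Horizontal reduction of the 4 lanes to one scalar.
double hsum(const Packet4& a, OpCount& c) {
    ++c.hsum;
    return (a[0] + a[1]) + (a[2] + a[3]);
}

// 12 doubles = 3 packets: 3 multiplies, 2 adds, 1 horizontal reduction.
double packetDot12(const std::array<double, 12>& x,
                   const std::array<double, 12>& b,
                   OpCount& c) {
    Packet4 acc = pmul(load(x.data()), load(b.data()), c);
    for (std::size_t i = 4; i < 12; i += 4) {
        acc = padd(acc, pmul(load(x.data() + i), load(b.data() + i), c), c);
    }
    return hsum(acc, c);
}
```

Calling `packetDot12` with `x = (1, 0, ..., 0)` leaves the counters at 3 multiplies, 2 adds and 1 horizontal sum. Only the first packet carries a nonzero entry of `x`, which is why the last 2 multiplies and the adds are the ones the compiler can drop, while the first multiply and the reduction remain.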

    So what if you disable Eigen's explicit vectorization by defining EIGEN_DONT_VECTORIZE? Then you get what you expected (https://godbolt.org/z/UQsoeH) but other pieces of code might become much slower.

    If you want to locally disable explicit vectorization and are not afraid of messing with Eigen's internals, you can introduce a DontVectorize option to Matrix and disable vectorization by specializing traits<> for this Matrix type:

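    #include <Eigen/Core>

    // Extra option bit, chosen so it does not collide with Eigen's own options.
    static const int DontVectorize = 0x80000000;
    
    namespace Eigen {
    namespace internal {
    
    // Specialize traits<> for any Matrix that carries the DontVectorize option.
    template<typename _Scalar, int _Rows, int _Cols, int _MaxRows, int _MaxCols>
    struct traits<Matrix<_Scalar, _Rows, _Cols, DontVectorize, _MaxRows, _MaxCols> >
    : traits<Matrix<_Scalar, _Rows, _Cols> >
    {
      typedef traits<Matrix<_Scalar, _Rows, _Cols> > Base;
      enum {
        // Clear PacketAccessBit so the evaluator takes the scalar path.
        EvaluatorFlags = Base::EvaluatorFlags & ~PacketAccessBit
      };
    };
    
    }
    }
    
    using ArrayS12d = Eigen::Matrix<double, 12, 1, DontVectorize>;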
    

    Full example here: https://godbolt.org/z/bOEyzv
