Element-wise vector-vector multiplication in BLAS?

孤独总比滥情好 2020-12-29 10:53

Is there a way to do element-wise vector-vector multiplication with BLAS, GSL, or any other high-performance library?

4 answers
  • 2020-12-29 11:23

    In GSL, gsl_vector_mul does the trick.

  • 2020-12-29 11:35

    I found that MKL has a whole set of mathematical operations on vectors in its Vector Mathematical Functions Library (VML), including v?Mul, which does what I want. It works with plain C++ arrays, so it is more convenient for me than GSL.

  • 2020-12-29 11:36

    There is always std::valarray [1], which defines element-wise operations that are frequently compiled into SIMD instructions (Intel C++ with /Quse-intel-optimized-headers, G++) if the target supports them.

    • http://software.intel.com/sites/products/documentation/hpc/composerxe/en-us/cpp/mac/cref_cls/common/cppref_valarray_intro.htm
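
    As a minimal sketch of the std::valarray approach (the helper name `elementwise_product` is made up for illustration and is not part of any library), `operator*` applied to two valarrays of equal length yields their element-wise product:

    ```cpp
    #include <valarray>

    // Hypothetical helper: returns the element-wise product of a and b.
    // Both operands must have the same size; otherwise the behavior is undefined.
    std::valarray<double> elementwise_product(const std::valarray<double>& a,
                                              const std::valarray<double>& b) {
        return a * b;  // valarray's operator* multiplies element by element
    }
    ```

    Calling it with {1, 3, 5} and {1, 10, 100} gives {1, 30, 500}. Whether the multiplication is actually compiled to SIMD instructions depends on the compiler, its flags, and the target.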

    Both of these compilers also perform auto-vectorization:

    • http://software.intel.com/en-us/articles/getting-code-ready-for-parallel-execution-with-intel-parallel-composer/
    • http://gcc.gnu.org/projects/tree-ssa/vectorization.html

    In that case you can just write

    #define N 10000

    float a[N], b[N], c[N];

    // Element-wise product: c[i] = a[i] * b[i] for every i in [0, N).
    void f1() {
      for (int i = 0; i < N; i++)
        c[i] = a[i] * b[i];
    }
    

    and see it compile into vectorized code (e.g. using SSE4), provided optimization is enabled.

    [1] Yes, valarrays are archaic and often thought of as obsolete, but in practice they are both standard and fit the task very well.

  • 2020-12-29 11:41

    (Taking the title of the question literally...)

    Yes, it can be done with BLAS alone (though it is probably not the most efficient way).

    The trick is to treat one of the input vectors as a diagonal matrix:

    ⎡a    ⎤ ⎡x⎤    ⎡ax⎤
    ⎢  b  ⎥ ⎢y⎥ =  ⎢by⎥
    ⎣    c⎦ ⎣z⎦    ⎣cz⎦
    

    You can then use one of the matrix-vector multiply functions that can take a diagonal matrix as input without padding, e.g. SBMV (symmetric band matrix-vector multiply) with zero off-diagonal bands.
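
    To see why this works: with k = 0 the band storage holds only the diagonal, so with lda = 1 the band array is simply the vector of diagonal entries, and the SBMV operation y := alpha*A*x + beta*y reduces to a scaled element-wise product. The loop below is a plain reference sketch of that arithmetic, not BLAS code; the name `diag_sbmv_reference` is hypothetical and unit strides are assumed:

    ```cpp
    #include <cstddef>
    #include <vector>

    // Hypothetical reference for what xSBMV computes when k == 0 (diagonal only):
    //   y[i] = alpha * a[i] * x[i] + beta * y[i]
    // Assumes unit strides and that a, x, and y all have the same length.
    void diag_sbmv_reference(double alpha, const std::vector<double>& a,
                             const std::vector<double>& x,
                             double beta, std::vector<double>& y) {
        for (std::size_t i = 0; i < y.size(); ++i) {
            const double product = alpha * a[i] * x[i];
            // As in BLAS, the old contents of y are ignored when beta is zero.
            y[i] = (beta == 0.0) ? product : product + beta * y[i];
        }
    }
    ```

    With alpha = 1 and beta = 0 this is exactly the element-wise product of a and x, which is why the BLAS call needs those two scalar values.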

    Example:

    #include <stdio.h>

    // Computes y[i] = a[i] * x[i] by passing a to DSBMV as a diagonal band matrix.
    void ebeMultiply(const int n, const double *a, const double *x, double *y)
    {
        extern void dsbmv_(const char *uplo,
                           const int *n,
                           const int *k,
                           const double *alpha,
                           const double *a,
                           const int *lda,
                           const double *x,
                           const int *incx,
                           const double *beta,
                           double *y,
                           const int *incy);
    
        static const int k = 0; // Just the diagonal; 0 super-diagonal bands
        static const double alpha = 1.0;
        static const int lda = 1;
        static const int incx = 1;
        static const double beta = 0.0;
        static const int incy = 1;
    
        dsbmv_("L", &n, &k, &alpha, a, &lda, x, &incx, &beta, y, &incy);
    }
    
    // Test
    #define N 3
    static const double a[N] = {1,3,5};
    static const double b[N] = {1,10,100};
    static double c[N];
    
    int main(int argc, char **argv)
    {
        ebeMultiply(N, a, b, c);
        printf("Result: [%f %f %f]\n", c[0], c[1], c[2]);
        return 0;
    }
    

    Result: [1.000000 30.000000 500.000000]
