I am looking for advice regarding high performance multi-dimensional array libraries/classes for C++. What I really need is:
the ability to dynamically allocate
From a performance perspective, I have tried boost::MultiArray and Armadillo. Neither were fast, in that both had slow access time compared to arrays or vectors, and I was able to beat these packages in an operation such as x1(4:10) = x2(1:6) + x2(2:7) + x2(3:8) by using a simple hand-coded loop (with the help of my compiler's optimization, I'm sure). When you get into matrix multiplication, these packages might offer some benefit via LAPACK and BLAS but you can always use those interfaces on your own.