I have two 2-d numpy arrays with the same dimensions, A and B, and am trying to calculate the row-wise dot product of them. I could do:
np.sum(A * B, axis=1)
Even though it is significantly slower for even moderate data sizes, I would use
np.diag(A.dot(B.T))
while you are developing the library, and worry about optimizing it later, once it runs in a production setting or after the unit tests are written.
To most people who come upon your code, this will be more understandable than einsum, and it also doesn't require you to break some best practices by embedding your calculation within a mini-DSL string that serves as the argument to some function call.
I agree that computing the off-diagonal elements is worth avoiding for large cases. It would have to be really, really large for me to care about that, though, and the trade-off, paying the awful price of expressing the calculation in an embedded string in einsum, is pretty severe.
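For reference, here is a minimal sketch comparing the three spellings discussed here. The array shapes and the random data are made up for illustration, and the subscript string 'ij,ij->i' is the usual way to write a row-wise dot product with einsum, not something taken from the question.

```python
import numpy as np

# Hypothetical example data: two arrays with the same shape (n rows, k columns).
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))
B = rng.standard_normal((5, 3))

# Element-wise multiply, then sum along each row: about n*k multiplications.
via_sum = np.sum(A * B, axis=1)

# Full n-by-n matrix product, keeping only the diagonal. This builds an
# n-by-n intermediate and does about n*n*k multiplications.
via_diag = np.diag(A.dot(B.T))

# einsum: multiply matching elements and sum over j, one value per row i.
via_einsum = np.einsum('ij,ij->i', A, B)

# All three should agree up to floating-point rounding.
print(np.allclose(via_sum, via_diag), np.allclose(via_sum, via_einsum))
```

The diag version is the only one whose intermediate grows with the square of the number of rows, which is why it only starts to hurt once the arrays have many rows.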