Say I have two very big matrices A (M-by-N) and B (N-by-M). I need the diagonal of A*B. Computing the full A*B requires M
Yes, this is one of the rare cases where a for loop is better.
I ran the following script through the profiler:
M = 5000;
N = 5000;

A = rand(M, N); B = rand(N, M);
product = A*B;
diag1 = diag(product);

A = rand(M, N); B = rand(N, M);
diag2 = diag(A*B);

A = rand(M, N); B = rand(N, M);
diag3 = zeros(M,1);
for i=1:M
    diag3(i) = A(i,:) * B(:,i);
end
I regenerate A and B before each test, just in case MATLAB would try to speed anything up by caching.
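For what it's worth, here is a minimal sketch to check that the loop produces the same values as diag(A*B). The sizes and the names ref, d and maxErr are my own choices for illustration, not part of the profiled script. Unlike the timing script, it reuses the same A and B for both computations so the results can be compared.

```matlab
% Small sizes so the full product is cheap; values chosen for illustration.
M = 200;
N = 300;
A = rand(M, N);
B = rand(N, M);

% Reference: form the full M-by-M product and keep only its diagonal.
ref = diag(A*B);

% Loop variant: the i-th diagonal entry is row i of A times column i of B.
d = zeros(M, 1);
for i = 1:M
    d(i) = A(i,:) * B(:,i);
end

% Largest absolute difference; should be at floating-point rounding level.
maxErr = max(abs(d - ref));
```

The two results may differ in the last few bits because the summation order is not guaranteed to be the same, so compare against a small tolerance rather than with isequal.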
Result (edited for brevity):
  time   calls  line
  6.29       1     5  product = A*B;
< 0.01       1     6  diag1 = diag(product);
  5.46       1     9  diag2 = diag(A*B);
             1    12  diag3 = zeros(M,1);
             1    13  for i=1:M
  0.52    5000    14      diag3(i) = A(i,:) * B(:,i);
< 0.01    5000    15  end
As we can see, the for loop variant is an order of magnitude faster than the other two in this case. While the diag(A*B) variant is actually faster than the diag(product) variant, it's marginal at best.
I tried some different values of M and N, and in my tests the for loop variant is slower only if M=1.
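If you want to try other shapes yourself, something along these lines should work. It is a rough sketch using tic/toc instead of the profiler; the list of sizes and the names sizes, tFull, tLoop, dFull and dLoop are placeholders I made up, and the timings will depend on your machine and MATLAB version.

```matlab
% Each row is one [M N] pair; these particular sizes are arbitrary examples.
sizes = [1 2000; 500 500; 2000 500];

for k = 1:size(sizes, 1)
    M = sizes(k, 1);
    N = sizes(k, 2);
    A = rand(M, N);
    B = rand(N, M);

    % Full product, then extract the diagonal.
    tic;
    dFull = diag(A*B);
    tFull = toc;

    % Loop over the diagonal entries only.
    tic;
    dLoop = zeros(M, 1);
    for i = 1:M
        dLoop(i) = A(i,:) * B(:,i);
    end
    tLoop = toc;

    fprintf('M=%d, N=%d: diag(A*B) %.4f s, loop %.4f s\n', M, N, tFull, tLoop);
end
```

A single tic/toc measurement is noisy, especially for the small cases, so repeat each run a few times before drawing conclusions.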