I have a sum of sums that I want to speed up. In one case it is:
S_{x,y,k,l} Fu_{ku} Fv_{lv} Fx_{kx} Fy_{ly}
In the other case it is:
S_{x,y} ( S_{k,l} Fu_{ku} Fv_{lv} Fx_{kx} Fy_{ly} )^2
I'll start a new answer since the problem has changed.
Try this:
E = np.einsum('uk, vl, xk, yl, xy, kl->uvxy', Fu, Fv, Fx, Fy, P, B)
E1 = np.einsum('uvxy->uv', E)             # sum over x and y
E2 = np.einsum('uvxy->uv', np.square(E))  # sum of squares over x and y
I've found it runs in about the same time as the computation of I1_.
Here is my test code: http://pastebin.com/ufwy7cLy
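For reference, here is a minimal self-contained check of that einsum against an explicit double loop. The sizes and random data are made-up placeholders, not the values from the linked test code:

import numpy as np

# Made-up sizes and random data, just to exercise the expression cheaply.
U, V, X, Y, K, L = 2, 3, 4, 5, 6, 7
rng = np.random.default_rng(0)
Fu = rng.random((U, K))
Fv = rng.random((V, L))
Fx = rng.random((X, K))
Fy = rng.random((Y, L))
P = rng.random((X, Y))
B = rng.random((K, L))

E = np.einsum('uk, vl, xk, yl, xy, kl->uvxy', Fu, Fv, Fx, Fy, P, B)
E1 = np.einsum('uvxy->uv', E)             # sum over x and y
E2 = np.einsum('uvxy->uv', np.square(E))  # sum of squares over x and y

# Spot-check one entry of E against an explicit double sum over k and l.
e = sum(Fu[0, k] * Fv[0, l] * Fx[0, k] * Fy[0, l] * P[0, 0] * B[k, l]
        for k in range(K) for l in range(L))
assert np.isclose(e, E[0, 0, 0, 0])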
(Update: Jump to the end to see the result expressed as a couple of matrix multiplications.)
I think you can greatly simplify the computation by using the identity:

S_{k,l} A_k B_l = (S_k A_k) * (S_l B_l)    -- when A_k is free of l and B_l is free of k

For instance,
S_{k,l} Fu_{ku} Fv_{lv} Fx_{kx} Fy_{ly}
= S_{k,l} Fu_{ku} Fx_{kx} Fv_{lv} Fy_{ly}             -- rearrange the factors
          \_____ A _____/ \_____ B _____/
= ( S_k Fu_{ku} Fx_{kx} ) * ( S_l Fv_{lv} Fy_{ly} )   -- from the identity
= A_{ux} * B_{vy}
where A_{ux} only depends on u and x, and B_{vy} only depends on v and y.
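A quick numeric sanity check of this factorization (sizes are illustrative; the arrays follow the Fu_{ku} index convention of the math above, with k as the first axis):

import numpy as np

# Illustrative sizes and random data.
K, L, U, V, X, Y = 4, 5, 2, 3, 2, 3
rng = np.random.default_rng(1)
Fu, Fx = rng.random((K, U)), rng.random((K, X))
Fv, Fy = rng.random((L, V)), rng.random((L, Y))

# Left side: the full double sum S_{k,l} Fu_{ku} Fv_{lv} Fx_{kx} Fy_{ly}.
lhs = np.einsum('ku,lv,kx,ly->uvxy', Fu, Fv, Fx, Fy)

# Right side: A_{ux} * B_{vy} with A = Fu^T . Fx and B = Fv^T . Fy.
A = Fu.T @ Fx
B = Fv.T @ Fy
rhs = np.einsum('ux,vy->uvxy', A, B)

assert np.allclose(lhs, rhs)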
For the square sum, we have:
S_k [ S_l Fu_{ku} Fv_{lv} Fx_{kx} Fy_{ly} ]^2
= S_k Fu_{ku}^2 Fx_{kx}^2 * [ S_l Fv_{lv} Fy_{ly} ]^2   -- Fu_{ku} and Fx_{kx} are free of l,
                                                        --   so they come out of the square squared
= S_k Fu_{ku}^2 Fx_{kx}^2 * B_{vy}^2                    -- B is from the above calc.
= B_{vy}^2 * S_k Fu_{ku}^2 Fx_{kx}^2                    -- B_{vy} is free of k
= B_{vy}^2 * A'_{ux}                                    -- where A'_{ux} = S_k Fu_{ku}^2 Fx_{kx}^2,
                                                        --   i.e. A built from the squared factors
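The same reduction can be checked numerically; A' here is the A matrix computed from the squared factors (again with made-up sizes):

import numpy as np

# Same illustrative setup as before.
K, L, U, V, X, Y = 4, 5, 2, 3, 2, 3
rng = np.random.default_rng(2)
Fu, Fx = rng.random((K, U)), rng.random((K, X))
Fv, Fy = rng.random((L, V)), rng.random((L, Y))

B = Fv.T @ Fy                             # B_{vy} = S_l Fv_{lv} Fy_{ly}

# Left side: keep k, do the inner sum over l, square, then sum over k.
T = np.einsum('ku,lv,kx,ly->kuvxy', Fu, Fv, Fx, Fy)
lhs = np.square(T).sum(axis=0)            # shape (U, V, X, Y)

# Right side: B_{vy}^2 * A'_{ux} with A' built from the squared factors.
A_sq = np.square(Fu).T @ np.square(Fx)    # A'_{ux} = S_k Fu_{ku}^2 Fx_{kx}^2
rhs = np.einsum('ux,vy->uvxy', A_sq, np.square(B))

assert np.allclose(lhs, rhs)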
Similar reductions occur when continuing the sum over x and y:
S_{xy} A_{ux} * B_{vy}
= ( S_x A_{ux} ) * ( S_y B_{vy} )    -- from the identity
= C_u * D_v
And then finally summing over u and v:
S_{uv} C_u D_v = (S_u C_u) * (S_v D_v) -- from the identity
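To make the chain concrete, here is a small check that the fully factored total matches a brute-force sum over all four indices (A and B are just random stand-ins):

import numpy as np

# Random stand-ins for A_{ux} and B_{vy}.
rng = np.random.default_rng(3)
A = rng.random((2, 4))
B = rng.random((3, 5))

C = A.sum(axis=1)                  # C_u = S_x A_{ux} (row sums of A)
D = B.sum(axis=1)                  # D_v = S_y B_{vy} (row sums of B)
total = C.sum() * D.sum()

brute = np.einsum('ux,vy->', A, B) # S_{u,v,x,y} A_{ux} B_{vy}
assert np.isclose(total, brute)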
Hope this helps.
Update: I just realized that perhaps for the square sum you wanted to compute
[ S_k S_l ... ]^2
in which case you can proceed like this:
[ S_k S_l Fu_{ku} Fv_{lv} Fx_{kx} Fy_{ly} ]^2
= [ A_{ux} * B_{vy} ]^2
= A_{ux}^2 * B_{vy}^2
So when we sum over the other variables we get:
S_{uvxy} A_{ux}^2 B_{vy}^2
= S_{uv} ( S_{xy} A_{ux}^2 B_{vy}^2 )
= S_{uv} ( S_x A_{ux}^2 ) * ( S_y B_{vy}^2 ) -- from the identity
= S_{uv} C'_u * D'_v                 -- where now C'_u = S_x A_{ux}^2 and D'_v = S_y B_{vy}^2
= (S_u C'_u) * (S_v D'_v)            -- from the identity
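And a sketch verifying this squared variant end to end, comparing the brute-force [ S_k S_l ... ]^2 summed over all indices against the factored form (sizes are arbitrary):

import numpy as np

# Illustrative sizes again.
K, L, U, V, X, Y = 4, 5, 2, 3, 2, 3
rng = np.random.default_rng(4)
Fu, Fx = rng.random((K, U)), rng.random((K, X))
Fv, Fy = rng.random((L, V)), rng.random((L, Y))

# Brute force: square the full inner sum over k and l, then sum over u, v, x, y.
inner = np.einsum('ku,lv,kx,ly->uvxy', Fu, Fv, Fx, Fy)
brute = np.square(inner).sum()

# Factored form: sum of squared entries of A times sum of squared entries of B.
A = Fu.T @ Fx
B = Fv.T @ Fy
fast = np.square(A).sum() * np.square(B).sum()

assert np.isclose(brute, fast)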
Update 2: This does boil down to just a few matrix multiplications.
The definitions of A and B:
A_{ux} = S_k Fu_{ku} Fx_{kx}
B_{vy} = S_l Fv_{lv} Fy_{ly}
may also be written in matrix form as:
A = (transpose Fu) . Fx -- . = matrix multiplication
B = (transpose Fv) . Fy
and from the definitions of C and D:
C_u = S_x A_{ux}
D_v = S_y B_{vy}
we see that the vector C is just the row sums of A and the vector D is just the row sums of B. Since the answer for the entire summation (not squared) is:
total = (S_u C_u) * (S_v D_v)
we see that the total is just the sum of all of the matrix elements of A times the sum of all of the matrix elements of B.
Here is the numpy code:
import numpy as np

# ... set up Fu, Fv, Fx, Fy as above ...
# Shapes follow the question's einsum convention: Fu is (U, K), Fx is (X, K),
# Fv is (V, L), Fy is (Y, L). A below is therefore the transpose of A_{ux}
# above, which doesn't matter here because we only take sums of its entries.
A = Fx.dot(Fu.T)
B = Fv.dot(Fy.T)
sum1 = A.sum() * B.sum()

A2 = np.square(A)
B2 = np.square(B)
sum2 = A2.sum() * B2.sum()

print("sum of terms:", sum1)
print("sum of squares of terms:", sum2)