I need to perform a special type of tensor contraction. I want something of this kind:
A_{bg} = Sum_{a,a\',a\'\'} ( B_{a} C_{a\'b} D_{a\'\'g} )
First, lets write your example out in Python loops, to have a baseline for comparisons. If I understood you correctly, this is what you want to do:
b, g = 4, 5
B = np.random.rand(2)
C = np.random.rand(2, b)
D = np.random.rand(2, g)
out = np.zeros((b, g))
for j in (0, 1):
for k in (0, 1):
for l in (0, 1):
if j + k + l in (1, 2):
out += B[j] * C[k, :, None] * D[l, None, :]
When I run this, I get this output:
>>> out
array([[ 1.27679643, 2.26125361, 1.32775173, 1.5517918 , 0.47083151],
[ 0.84302586, 1.57516142, 1.1335904 , 1.14702252, 0.34226837],
[ 0.70592576, 1.34187278, 1.02080112, 0.99458563, 0.29535054],
[ 1.66907981, 3.07143067, 2.09677013, 2.20062463, 0.65961165]])
You can't get at this directly with np.einsum
, but you can run it twice and get your result as the difference of these two:
>>> np.einsum('i,jk,lm->km', B, C, D) - np.einsum('i,ik,im->km', B, C, D)
array([[ 1.27679643, 2.26125361, 1.32775173, 1.5517918 , 0.47083151],
[ 0.84302586, 1.57516142, 1.1335904 , 1.14702252, 0.34226837],
[ 0.70592576, 1.34187278, 1.02080112, 0.99458563, 0.29535054],
[ 1.66907981, 3.07143067, 2.09677013, 2.20062463, 0.65961165]])
The first call to np.einsum
is adding everything up, regardless of what the indices add up to. The second only adds up those where all three indices are the same. So obviously your result is the difference of the two.
Ideally, you could now go on to write something like:
>>>(np.einsum('i...,j...,k...->...', B, C, D) -
... np.einsum('i...,i...,i...->...', B, C, D))
and get your result regardless of the dimensions of your C and D arrays. If you try the first, you will get the following error message:
ValueError: operands could not be broadcast together with remapped shapes
[original->remapped]: (2)->(2,newaxis,newaxis) (2,4)->(4,newaxis,2,newaxis)
(2,5)->(5,newaxis,newaxis,2)
The problem is that, since you are not specifying what you want to do with the b
and g
dimensions of your tensors, it tries to broadcast them together, and since they are different, it fails. You can get it to work by adding extra dimensions of size 1:
>>> (np.einsum('i...,j...,k...->...', B, C, D[:, None]) -
... np.einsum('i...,i...,i...->...', B, C, D[:, None]))
array([[ 1.27679643, 2.26125361, 1.32775173, 1.5517918 , 0.47083151],
[ 0.84302586, 1.57516142, 1.1335904 , 1.14702252, 0.34226837],
[ 0.70592576, 1.34187278, 1.02080112, 0.99458563, 0.29535054],
[ 1.66907981, 3.07143067, 2.09677013, 2.20062463, 0.65961165]])
If you wanted all the axes of B to be placed before all the axes of C, and these before all the axes of D, the following seems to work, at least as far as creating an output of the right shape, although you may want to double check that the result is really what you want:
>>> B = np.random.rand(2, 3)
>>> C = np.random.rand(2, 4, 5)
>>> D = np.random.rand(2, 6)
>>> C_idx = (slice(None),) + (None,) * (B.ndim - 1)
>>> D_idx = C_idx + (None,) * (C.ndim - 1)
>>> (np.einsum('i...,j...,k...->...', B, C[C_idx], D[D_idx]) -
... np.einsum('i...,i...,i...->...', B, C[C_idx], D[D_idx])).shape
(3L, 4L, 5L, 6L)
EDIT From the comments, if instead of just the first axis of each tensor having to be reduced over, it is the first two, then the above could be written as:
>>> B = np.random.rand(2, 2, 3)
>>> C = np.random.rand(2, 2, 4, 5)
>>> D = np.random.rand(2, 2, 6)
>>> C_idx = (slice(None),) * 2 + (None,) * (B.ndim - 2)
>>> D_idx = C_idx + (None,) * (C.ndim - 2)
>>> (np.einsum('ij...,kl...,mn...->...', B, C[C_idx], D[D_idx]) -
... np.einsum('ij...,ij...,ij...->...', B, C[C_idx], D[D_idx])).shape
(3L, 4L, 5L, 6L)
More generally, if reducing over d
indices, C_idx
and D_idx
would look like:
>>> C_idx = (slice(None),) * d + (None,) * (B.ndim - d)
>>> D_idx = C_idx + (None,) * (C.ndim - d)
and the calls to np.einsum
would need to have d
letters in the indexing, unique in the first call, repeating in the second.
EDIT 2 So what exactly goes on with C_idx
and D_idx
? Take the last example, with B
, C
and D
with shapes (2, 2, 3)
, (2, 2, 4, 5)
and (2, 2, 6)
. C_idx
is made up of two empty slices, plus as many None
s as the number of dimensions of B
minus 2, so when we take C[C_idx]
the result has shape (2, 2, 1, 4, 5)
. Similarly D_idx
is C_idx
plus as many None
s as the number of dimensions of C
minus 2, so the result of D[D_idx]
has shape (2, 2, 1, 1, 1, 6)
. These three arrays don't braodcast together, but np.einsum
adds additional dimensions of size 1, i.e. the "remapped" shapes of the error above, so the resulting arrays turn out to have extra trailing ones, and the shapes amtch as follows:
(2, 2, 3, 1, 1, 1)
(2, 2, 1, 4, 5, 1)
(2, 2, 1, 1, 1, 6)
The first two axes are reduced, so the disappear from the output, and in the other cases, broadcasting applies, where a dimension of size 1 is replicated to match a larger one, so the output is (3, 4, 5, 6)
as we wanted.
@hpaulj proposes a method using "Levi-Civita like" tensors, that should in theory be faster, see my comments to the original question. Here's some code for comparison:
b, g = 5000, 2000
B = np.random.rand(2)
C = np.random.rand(2, b)
D = np.random.rand(2, g)
def calc1(b, c, d):
return (np.einsum('i,jm,kn->mn', b, c, d) -
np.einsum('i,im,in->mn', b, c, d))
def calc2(b, c, d):
return np.einsum('ijk,i,jm,kn->mn', calc2.e, b, c, d)
calc2.e = np.ones((2,2,2))
calc2.e[0, 0, 0] = 0
calc2.e[1, 1, 1] = 0
But when running it:
%timeit calc1(B, C, D)
1 loops, best of 3: 361 ms per loop
%timeit calc2(B, C, D)
1 loops, best of 3: 643 ms per loop
np.allclose(calc1(B, C, D), calc2(B, C, D))
Out[48]: True
A surprising result, which I can't explain...