I'm struggling to understand exactly how einsum works. I've looked at the documentation and a few examples, but it's not seeming to stick.
Here's an
I think the simplest example is in the TensorFlow docs.
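There are four steps to convert your equation to einsum notation. Let's take this equation as an example: C[i,k] = sum_j A[i,j] * B[j,k]
1. Drop the variable names, keeping only the indices. We get ik = sum_j ij * jk
2. Drop the sum_j term, as it is implicit. We get ik = ij * jk
3. Replace * with a comma. We get ik = ij, jk
4. Move the output to the right-hand side and separate it with the -> sign. We get ij, jk -> ik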
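To make the starting equation concrete, here is a minimal sketch that writes C[i,k] = sum_j A[i,j] * B[j,k] as explicit loops and compares it with the einsum string derived above. It assumes NumPy's `np.einsum` rather than the TensorFlow function; the array shapes and values are made up for illustration.

```python
import numpy as np

A = np.arange(6).reshape(2, 3)    # indices (i, j): 2 rows, 3 columns
B = np.arange(12).reshape(3, 4)   # indices (j, k): 3 rows, 4 columns

# Explicit form of C[i,k] = sum_j A[i,j] * B[j,k]
C_loops = np.zeros((2, 4), dtype=A.dtype)
for i in range(2):
    for k in range(4):
        for j in range(3):        # j is absent from the output, so it is summed
            C_loops[i, k] += A[i, j] * B[j, k]

# The same computation in einsum notation
C_einsum = np.einsum('ij,jk->ik', A, B)

assert np.array_equal(C_loops, C_einsum)
assert np.array_equal(C_einsum, A @ B)   # it is ordinary matrix multiplication
```

The loop variables line up one-to-one with the letters in the string: `i` and `k` index the output, and `j` is the loop that accumulates.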
The einsum interpreter just runs these 4 steps in reverse. All indices missing in the result are summed over.
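As an illustration of "running the steps in reverse", here is a toy interpreter for nested Python lists. It is only a sketch: `tiny_einsum` is a hypothetical name, it requires an explicit `->`, it does no error checking, and real implementations are far more optimized.

```python
from itertools import product

def tiny_einsum(spec, *operands):
    """Toy einsum over nested lists; the spec must contain '->'."""
    # Undo step 4: split the output indices from the inputs.
    inputs, output = spec.replace(' ', '').split('->')
    # Undo step 3: each comma-separated term belongs to one operand.
    input_terms = inputs.split(',')

    # Read the size of every index letter off the operand shapes.
    sizes = {}
    for term, operand in zip(input_terms, operands):
        level = operand
        for letter in term:
            sizes[letter] = len(level)
            level = level[0]

    # Undo step 2: indices missing from the output are summed over.
    summed = [c for c in sizes if c not in output]

    def element(term, operand, env):
        # Look up operand[env[a]][env[b]]... for the letters in term.
        for letter in term:
            operand = operand[env[letter]]
        return operand

    def build(remaining, env):
        if not remaining:
            total = 0
            for values in product(*(range(sizes[c]) for c in summed)):
                full = {**env, **dict(zip(summed, values))}
                prod = 1
                for term, operand in zip(input_terms, operands):
                    prod *= element(term, operand, full)
                total += prod
            return total
        head, rest = remaining[0], remaining[1:]
        return [build(rest, {**env, head: n}) for n in range(sizes[head])]

    return build(output, {})

print(tiny_einsum('ij,jk->ik', [[1, 2], [3, 4]], [[5, 6], [7, 8]]))
print(tiny_einsum('i,i->', [1, 2, 3], [4, 5, 6]))
```

The output indices become nested loops that build the result, and every other index becomes a loop that accumulates a sum of products.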
Here are some more examples from the docs:
# Matrix multiplication
einsum('ij,jk->ik', m0, m1) # output[i,k] = sum_j m0[i,j] * m1[j, k]
# Dot product
einsum('i,i->', u, v) # output = sum_i u[i]*v[i]
# Outer product
einsum('i,j->ij', u, v) # output[i,j] = u[i]*v[j]
# Transpose
einsum('ij->ji', m) # output[j,i] = m[i,j]
# Trace
einsum('ii', m) # output = trace(m) = sum_i m[i, i]
# Batch matrix multiplication
einsum('aij,ajk->aik', s, t) # out[a,i,k] = sum_j s[a,i,j] * t[a, j, k]
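If you want to convince yourself that these strings do what the comments say, one option is to compare each against the conventional call. The sketch below assumes NumPy's `np.einsum` (which accepts the same strings) instead of TensorFlow, and the random inputs and shapes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
u, v = rng.random(3), rng.random(3)
m0, m1 = rng.random((2, 3)), rng.random((3, 4))
m = rng.random((3, 3))
s, t = rng.random((5, 2, 3)), rng.random((5, 3, 4))

# Each entry pairs an einsum call with its conventional equivalent.
checks = {
    'matrix multiplication': (np.einsum('ij,jk->ik', m0, m1), m0 @ m1),
    'dot product': (np.einsum('i,i->', u, v), np.dot(u, v)),
    'outer product': (np.einsum('i,j->ij', u, v), np.outer(u, v)),
    'transpose': (np.einsum('ij->ji', m), m.T),
    'trace': (np.einsum('ii', m), np.trace(m)),
    'batch matrix multiplication': (np.einsum('aij,ajk->aik', s, t), s @ t),
}

for name, (got, expected) in checks.items():
    assert np.allclose(got, expected), name
```

Note that in the batch case the index `a` appears in both inputs and in the output, so it is carried through rather than summed.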