Question
Which numpy function should be used for the mathematical dot product in the case below?
- Backpropagation for a Linear Layer
Answer 1:
Define a sample (2,3) array, dldx, alongside a (2,3) array w with the values shown below:
In [299]: dldx = np.arange(6).reshape(2,3)
In [300]: w
Out[300]:
array([[0.1, 0.2, 0.3],
       [0. , 0. , 0. ]])
Element wise multiplication:
In [301]: dldx*w
Out[301]:
array([[0. , 0.2, 0.6],
       [0. , 0. , 0. ]])
and summing on the last axis (size 3) produces a 2 element array:
In [302]: (dldx*w).sum(axis=1)
Out[302]: array([0.8, 0. ])
Your (6) is the first element of this result, dropping the 0. One might argue that the use of a dot/inner product in (5) is a bit sloppy.
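The multiply-and-sum steps above can also be run as a standalone script. This is only a sketch: the values of w are copied from the session output, and the name row_dots is illustrative rather than part of the original question.

```python
import numpy as np

dldx = np.arange(6).reshape(2, 3)        # [[0, 1, 2], [3, 4, 5]]
w = np.array([[0.1, 0.2, 0.3],
              [0.0, 0.0, 0.0]])          # values copied from the session above

# Multiply matching elements, then sum over the last axis (size 3).
# Each row of dldx is thereby dotted with the corresponding row of w.
row_dots = (dldx * w).sum(axis=1)        # illustrative name
print(row_dots.shape)                    # (2,)
```

The result has one entry per row, which is why it is a 2 element array rather than a single scalar.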
np.einsum borrows ideas from physics, where dimensions may be higher. This case can be expressed as:
In [303]: np.einsum('ij,ij->i',dldx,w)
Out[303]: array([0.8, 0. ])
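The choice of subscript letters matters here, so a short check may help. In this sketch (variable names paired and unpaired are illustrative), repeating j in both operands multiplies matching elements and sums over j, which reproduces the row-wise sum. Using distinct letters j and k instead sums each row on its own before multiplying, which is a different quantity.

```python
import numpy as np

dldx = np.arange(6).reshape(2, 3)
w = np.array([[0.1, 0.2, 0.3],
              [0.0, 0.0, 0.0]])

# Same letter j in both operands: multiply matching elements, sum over j.
paired = np.einsum('ij,ij->i', dldx, w)

# Different letters j and k: the sums over j and k are independent, so this
# equals dldx.sum(axis=1) * w.sum(axis=1), not a row-wise dot product.
unpaired = np.einsum('ij,ik->i', dldx, w)

assert np.allclose(paired, (dldx * w).sum(axis=1))
assert np.allclose(unpaired, dldx.sum(axis=1) * w.sum(axis=1))
```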
inner and dot do more calculations than we want. We just want the diagonal:
In [304]: np.dot(dldx,w.T)
Out[304]:
array([[0.8, 0. ],
       [2.6, 0. ]])
In [305]: np.inner(dldx,w)
Out[305]:
array([[0.8, 0. ],
       [2.6, 0. ]])
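To make the "diagonal" remark concrete, here is a small sketch that pulls the diagonal out of the full product with np.diag. The names full and wanted are illustrative. Note that this still computes the off-diagonal entries and then discards them, so it is a check rather than a recommendation.

```python
import numpy as np

dldx = np.arange(6).reshape(2, 3)
w = np.array([[0.1, 0.2, 0.3],
              [0.0, 0.0, 0.0]])

# (2, 2) result: every row of dldx dotted with every row of w.
full = np.dot(dldx, w.T)

# Keep only the entries where row i of dldx meets row i of w.
wanted = np.diag(full)

assert np.allclose(wanted, (dldx * w).sum(axis=1))
```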
In matmul/@ terms, the size 2 dimension is a 'batch' one, so we have to add dimensions:
In [306]: dldx[:,None,:]@w[:,:,None]
Out[306]:
array([[[0.8]],
       [[0. ]]])
This is (2,1,1), so we need to squeeze out the 1s.
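A sketch of that final squeeze, with two equivalent ways of dropping the size 1 axes. The names batched, by_index and by_squeeze are illustrative.

```python
import numpy as np

dldx = np.arange(6).reshape(2, 3)
w = np.array([[0.1, 0.2, 0.3],
              [0.0, 0.0, 0.0]])

# (2, 1, 3) @ (2, 3, 1) -> (2, 1, 1); the leading 2 is the batch dimension.
batched = dldx[:, None, :] @ w[:, :, None]

# Option 1: index the two size 1 axes away.
by_index = batched[:, 0, 0]

# Option 2: squeeze them, naming the axes so the batch axis is never
# removed by accident (for example if the batch size happened to be 1).
by_squeeze = batched.squeeze(axis=(1, 2))

assert by_index.shape == by_squeeze.shape == (2,)
```

Naming the axes explicitly is a design choice: a bare squeeze() would also work for this (2,1,1) array, but it removes every size 1 axis it finds.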
Source: https://stackoverflow.com/questions/65668129/numpy-function-to-use-for-mathematical-dot-product-to-produce-scalar