I'm new to Python and NumPy in general. I read several tutorials and am still confused about the differences between dim, ranks, shape, axes and dimensions. My mind seems to be s
np.dot is a generalization of matrix multiplication.
In regular matrix multiplication, an (N,M)-shaped matrix multiplied with an (M,P)-shaped matrix results in an (N,P)-shaped matrix. The resultant shape can be thought of as being formed by squashing the two shapes together ((N,M,M,P)) and then removing the middle numbers, M (to produce (N,P)). This is the property that np.dot preserves while generalizing to arrays of higher dimension.
When the docs say,
"For N dimensions it is a sum product over the last axis of a and the second-to-last of b",
they are speaking to this point. An array of shape (u,v,M) dotted with an array of shape (w,x,y,M,z) would result in an array of shape (u,v,w,x,y,z).
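As a quick check of that shape rule, here is a small sketch. The concrete sizes (2, 3, 4, 5, 6, 7, 8) are arbitrary values I picked for illustration; the matching axis M has length 4 here.

```python
import numpy as np

# Arbitrary illustrative sizes: u=2, v=3, M=4, w=5, x=6, y=7, z=8
a = np.ones((2, 3, 4))        # shape (u, v, M)
b = np.ones((5, 6, 7, 4, 8))  # shape (w, x, y, M, z)

result = np.dot(a, b)

# The two M axes are summed over and vanish from the result
print(result.shape)  # (2, 3, 5, 6, 7, 8)
```

Each entry of result is a sum of M = 4 products of ones, so every value is 4.0.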
Let's see how this rule looks when applied to a 1-dimensional array V and a 2-dimensional array M:
In [25]: V = np.arange(2); V
Out[25]: array([0, 1])
In [26]: M = np.arange(4).reshape(2,2); M
Out[26]:
array([[0, 1],
       [2, 3]])
First, the easy part:
In [27]: np.dot(M, V)
Out[27]: array([1, 3])
There is no surprise here; this is just matrix-vector multiplication.
Now consider
In [28]: np.dot(V, M)
Out[28]: array([2, 3])
Look at the shape of V and M:
In [29]: V.shape
Out[29]: (2,)
In [30]: M.shape
Out[30]: (2, 2)
So np.dot(V,M) is like matrix multiplication of a (2,)-shaped matrix with a (2,2)-shaped matrix, which should result in a (2,)-shaped matrix.
The last (and only) axis of V and the second-to-last axis of M (aka the first axis of M) are multiplied and summed over, leaving only the last axis of M.
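One way to make the "which axes get summed" statement explicit is np.einsum, where you name the axes yourself. This is only a sketch for illustration, not a claim about how np.dot is implemented; the variable name via_einsum is my own.

```python
import numpy as np

V = np.arange(2)                # shape (2,)
M = np.arange(4).reshape(2, 2)  # shape (2, 2)

# 'i' labels V's only axis and M's first axis. It does not appear after '->',
# so it is multiplied and summed over. 'j' (M's last axis) survives.
via_einsum = np.einsum('i,ij->j', V, M)

print(via_einsum)    # [2 3]
print(np.dot(V, M))  # [2 3]
```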
If you want to visualize this: np.dot(V, M) looks as though V has 1 row and 2 columns:
[[0, 1]] * [[0, 1],
            [2, 3]]
and so, when V is multiplied by M, np.dot(V, M) equals
[0*0 + 1*2, 0*1 + 1*3] = [2, 3]
However, I don't really recommend trying to visualize NumPy arrays this way -- at least I never do. I focus almost exclusively on the shape.
(2,) * (2,2)
   \   /
    \ /
    (2,)
You just think about the "middle" axes being dotted, and disappearing from the resultant shape.
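If it helps, the shape-only reasoning can be written down as a tiny function. predicted_dot_shape is a hypothetical helper of my own, not part of NumPy, and it assumes both inputs have at least one dimension (np.dot treats scalars differently).

```python
import numpy as np

def predicted_dot_shape(a_shape, b_shape):
    """Hypothetical helper: predict np.dot's result shape (inputs with ndim >= 1)."""
    # The dotted axis of b is its second-to-last axis, or its only axis if b is 1-D
    b_axis = -2 if len(b_shape) >= 2 else -1
    if a_shape[-1] != b_shape[b_axis]:
        raise ValueError("shapes not aligned")
    b_rest = list(b_shape)
    del b_rest[b_axis]  # the "middle" axis of b disappears
    # The last axis of a disappears too; everything else is kept in order
    return tuple(a_shape[:-1]) + tuple(b_rest)

V = np.arange(2)
M = np.arange(4).reshape(2, 2)

print(predicted_dot_shape(V.shape, M.shape))  # (2,)
print(np.dot(V, M).shape)                     # (2,)
```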
np.sum(arr, axis=0) tells NumPy to sum the elements in arr, eliminating the 0th axis. If arr is 2-dimensional, the 0th axis indexes the rows. So for example, if arr looks like this:
In [1]: arr = np.arange(6).reshape(2,3); arr
Out[1]:
array([[0, 1, 2],
       [3, 4, 5]])
then np.sum(arr, axis=0) will sum down each column, collapsing the rows and thus eliminating the 0th axis.
In [2]: np.sum(arr, axis=0)
Out[2]: array([3, 5, 7])
The 3 is the result of 0+3, the 5 equals 1+4, the 7 equals 2+5.
Notice arr had shape (2,3), and after summing, the 0th axis is removed so the result is of shape (3,). The 0th axis had length 2, and each sum is composed of adding those 2 elements. The shape (2,3) "becomes" (3,). You can know the resultant shape in advance! This can help guide your thinking.
To test your understanding, consider np.sum(arr, axis=1). Now the 1-axis is removed. So the resultant shape will be (2,), and each element in the result will be the sum of 3 values.
In [3]: np.sum(arr, axis=1)
Out[3]: array([ 3, 12])
The 3 equals 0+1+2, and the 12 equals 3+4+5.
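The same rule holds for arrays with more axes. Here is a small sketch using a 3-dimensional array; arr3 and its shape (2, 3, 4) are just values I chose for illustration.

```python
import numpy as np

arr3 = np.arange(24).reshape(2, 3, 4)

for axis in range(arr3.ndim):
    # Predict the result shape by deleting the summed axis from the original shape
    expected = arr3.shape[:axis] + arr3.shape[axis + 1:]
    actual = np.sum(arr3, axis=axis).shape
    print(axis, actual, expected)

# 0 (3, 4) (3, 4)
# 1 (2, 4) (2, 4)
# 2 (2, 3) (2, 3)
```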
So we see that summing over an axis eliminates that axis from the result. This has bearing on np.dot, since the calculation performed by np.dot is a sum of products. Since np.dot performs a summing operation along certain axes, those axes are removed from the result. That is why applying np.dot to arrays of shape (2,) and (2,2) results in an array of shape (2,). The first 2 in each shape is summed over, eliminating both, leaving only the second 2 in the second array.
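To tie the two ideas together, np.dot(V, M) can be reproduced by hand as "multiply, then np.sum over axis 0". This is a sketch of the arithmetic only, not how NumPy computes it internally; the names products and summed are mine.

```python
import numpy as np

V = np.arange(2)                # shape (2,)
M = np.arange(4).reshape(2, 2)  # shape (2, 2)

# Line up V's only axis with M's first axis: products[i, j] = V[i] * M[i, j]
products = V[:, np.newaxis] * M      # shape (2, 2)

# Summing over axis 0 removes that axis, leaving only M's last axis
summed = np.sum(products, axis=0)    # shape (2,)

print(summed)        # [2 3]
print(np.dot(V, M))  # [2 3]
```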