I was growing confused during the development of a small Python script involving matrix operations, so I fired up a shell to play around with a toy example and develop a bet
Imagine you have the following:
>> A = np.array([[1,2,3,4],[5,6,7,8],[9,10,11,12]])
If you want to get the values of the second column, use the following:
>> A.T[1]
array([ 2, 6, 10])
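As a small sketch of the same lookup (the variable names via_transpose and via_slice are only illustrative), slicing with A[:, 1] should give the same column without going through the transpose:

```python
import numpy as np

A = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])

# Transposing turns columns into rows, so row 1 of A.T is column 1 of A.
via_transpose = A.T[1]

# Taking every row and column index 1 selects the same values directly.
via_slice = A[:, 1]

print(via_transpose)  # [ 2  6 10]
print(via_slice)      # [ 2  6 10]
```

Both forms are basic indexing, so which one to use is mostly a matter of readability.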
Look at the shape after indexing:
In [295]: A=np.matrix([1,2,3])
In [296]: A.shape
Out[296]: (1, 3)
In [297]: A[0]
Out[297]: matrix([[1, 2, 3]])
In [298]: A[0].shape
Out[298]: (1, 3)
The key to this behavior is that np.matrix is always 2d. So even if you select one row (A[0,:]), the result is still 2d, shape (1,3). So you can string along as many [0] as you like, and nothing new happens.
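To make the point concrete, here is a minimal sketch (the names M, arr, once and thrice are illustrative, not from the session above) comparing a chain of [0] on a matrix with the same chain on a plain 2d array:

```python
import numpy as np

M = np.matrix([1, 2, 3])

# Each [0] selects the first (and only) row, which is again a 1x3 matrix.
once = M[0]
thrice = M[0][0][0]

print(once.shape)    # (1, 3)
print(thrice.shape)  # (1, 3)

# Contrast: on a plain 2d array every [0] removes one dimension.
arr = np.array([[1, 2, 3]])
print(arr[0].shape)     # (3,)
print(arr[0][0].shape)  # ()
```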
What are you trying to accomplish with A[0][0]? The same as A[0,0]?
For the base np.ndarray class these are equivalent.
Note that the Python interpreter translates indexing into __getitem__ calls:
A.__getitem__(0).__getitem__(0)
A.__getitem__((0,0))
[0][0] is 2 indexing operations, not one. So the effect of the second [0] depends on what the first produces.
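One way to see the two call patterns is a toy class that records the keys it receives. IndexLogger below is a hypothetical helper written for this illustration; it is not part of numpy:

```python
class IndexLogger:
    """Hypothetical helper: records every key passed to __getitem__."""

    def __init__(self, log):
        self.log = log

    def __getitem__(self, key):
        self.log.append(key)
        return self  # return self so that a chained index is logged too


log_chained = []
IndexLogger(log_chained)[0][0]  # two separate __getitem__(0) calls

log_tuple = []
IndexLogger(log_tuple)[0, 0]    # one __getitem__((0, 0)) call

print(log_chained)  # [0, 0]
print(log_tuple)    # [(0, 0)]
```

The chained form hands the second key to whatever object the first call returned, which is why the result depends on the class of that intermediate object.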
For an array, A[0,0] is equivalent to A[0,:][0]. But for a matrix, you need to do:
In [299]: A[0,:][:,0]
Out[299]: matrix([[1]]) # still 2d
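The following sketch puts the array and matrix cases side by side (arr, M and corner are illustrative names). It also shows that, as far as I can tell, a single tuple index with two integers returns a plain scalar for both classes:

```python
import numpy as np

arr = np.array([[1, 2, 3]])
M = np.matrix([1, 2, 3])

# ndarray: the selected row is 1d, so a second [0] reaches the element.
assert arr[0, :][0] == arr[0, 0] == 1

# matrix: the selected row is still 1x3, so the column must be named explicitly.
corner = M[0, :][:, 0]
print(corner.shape)  # (1, 1)

# One tuple index with two integers gives a scalar, even for a matrix.
print(M[0, 0])  # 1
```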
=============================
"An array of itself", but I doubt anyone in their right mind would choose that as a model for matrices in a scientific library.
What is, then, the logic to the output I obtained? Why would the first element of a matrix object be itself?
In addition, A[0,:] is not the same as A[0]
In light of these comments let me suggest some clarifications.
A[0] does not mean 'return the 1st element'. It means select along the 1st axis. For a 1d array that means the 1st item. For a 2d array it means the 1st row. For ndarray that would be a 1d array, but for a matrix it is another matrix. So for a 2d array or matrix, A[i,:] is the same thing as A[i].
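A quick check of that equivalence, as a sketch with an illustrative 3x4 example (arr and M are names chosen here):

```python
import numpy as np

arr = np.arange(12).reshape(3, 4)
M = np.matrix(arr)

# For a 2d ndarray both forms give the same 1d row.
print(arr[1].shape, arr[1, :].shape)      # (4,) (4,)
print(np.array_equal(arr[1], arr[1, :]))  # True

# For a matrix both forms give the same 1x4 matrix.
print(M[1].shape, M[1, :].shape)          # (1, 4) (1, 4)
print(np.array_equal(M[1], M[1, :]))      # True
```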
A[0] does not just return itself. It returns a new matrix, with a different id:
In [303]: id(A)
Out[303]: 2994367932
In [304]: id(A[0])
Out[304]: 2994532108
It may have the same data, shape and strides, but it's a new object. It's just as unique as the ith row of a multi-row matrix.
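The id values will differ from run to run, so a more portable check is an identity test plus np.shares_memory. The sketch below (M and row are illustrative names) assumes that row selection with an integer index returns a view, which is how basic indexing normally behaves:

```python
import numpy as np

M = np.matrix([1, 2, 3])
row = M[0]

print(row is M)                  # False: a distinct Python object
print(np.shares_memory(row, M))  # True: both refer to the same buffer

# Because the data is shared, writing through the row changes M as well.
row[0, 0] = 99
print(M[0, 0])  # 99
```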
Most of the unique matrix activity is defined in numpy/matrixlib/defmatrix.py. I was going to suggest looking at the matrix.__getitem__ method, but most of the action is performed in np.ndarray.__getitem__.
The np.matrix class was added to numpy as a convenience for old-school MATLAB programmers. numpy arrays can have almost any number of dimensions: 0, 1, and so on. MATLAB originally allowed only 2, though a release around 2000 generalized it to 2 or more.
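As a rough illustration of the difference in dimensionality (the variable names are chosen for this sketch), ndarray reports whatever number of dimensions the input has, while np.matrix promotes 1d input to a single-row 2d matrix:

```python
import numpy as np

scalar_like = np.array(5)
vector = np.array([1, 2, 3])
cube = np.zeros((2, 3, 4))

print(scalar_like.ndim)  # 0
print(vector.ndim)       # 1
print(cube.ndim)         # 3

# np.matrix turns the same 1d input into a 1x3 matrix.
as_matrix = np.matrix([1, 2, 3])
print(as_matrix.ndim)    # 2
print(as_matrix.shape)   # (1, 3)
```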