问题
Is there a more efficient method in python to extract data from a nested python list such as A = array([[array([[12000000]])]], dtype=object)
. I have been using A[0][0][0][0]
, it does not seem to be an efficinet method when you have lots of data like A.
I have also used
numpy.squeeeze(array([[array([[12000000]])]], dtype=object))
but this gives me
array(array([[12000000]]), dtype=object)
PS: The nested array was generated by loadmat()
function in scipy module to load a .mat file which consists of nested structures.
回答1:
Creating such an array is a bit tedious, but loadmat
does it to handle the MATLAB cells and 2d matrix:
In [5]: A = np.empty((1,1),object)
In [6]: A[0,0] = np.array([[1.23]])
In [7]: A
Out[7]: array([[array([[ 1.23]])]], dtype=object)
In [8]: A.any()
Out[8]: array([[ 1.23]])
In [9]: A.shape
Out[9]: (1, 1)
squeeze
compresses the shape, but does not cross the object
boundary
In [10]: np.squeeze(A)
Out[10]: array(array([[ 1.23]]), dtype=object)
but if you have one item in an array (regardless of shape) item()
can extract it. Indexing also works, A[0,0]
In [11]: np.squeeze(A).item()
Out[11]: array([[ 1.23]])
item
again to extract the number from that inner array:
In [12]: np.squeeze(A).item().item()
Out[12]: 1.23
Or we don't even need the squeeze
:
In [13]: A.item().item()
Out[13]: 1.23
loadmat
has a squeeze_me
parameter.
Indexing is just as easy:
In [17]: A[0,0]
Out[17]: array([[ 1.23]])
In [18]: A[0,0][0,0]
Out[18]: 1.23
astype
can also work (though it can be picky about the number of dimensions).
In [21]: A.astype(float)
Out[21]: array([[ 1.23]])
With single item arrays like efficiency isn't much of an issue. All these methods are quick. Things become more complicated when the array has many items, or the items are themselves large.
How to access elements of numpy ndarray?
回答2:
You could use A.all() or A.any() to get a scalar. This would only work if A contains one element.
回答3:
Try A.flatten()[0]
This will flatten the array into a single dimension and extract the first item from it. In your case, the first item is the only item.
回答4:
What worked in my case was the following..
import scipy.io
xcat = scipy.io.loadmat(os.path.join(dir_data, file_name))
pars = xcat['pars'] # Extract numpy.void element from the loadmat object
# Note that you are dealing with a numpy structured array object when you enter pars[0][0].
# Thus you can acces names and all that...
dict_values = [x[0][0] for x in pars[0][0]] # Extract all elements in one go
dict_keys = list(pars.dtype.names) # Extract the corresponding names/tags
dict_xcat = dict(zip(dict_keys, dict_values)) # Pack it up again in a dict
where the idea behind this is.. first extract ALL values I want, and format them in a nice python dict. This prevents me from cumbersome indexing later in the file...
Of course, this is a very specific solution. Since in my case the values I needed were all floats/ints.
来源:https://stackoverflow.com/questions/48233313/how-to-efficiently-extract-values-from-nested-numpy-arrays-generated-by-loadmat