How to return a view of several columns in numpy structured array

既然无缘 2021-01-31 18:13

I can see several columns (fields) at once in a numpy structured array by indexing with a list of the field names, for example:

import n         


        
5 Answers
  • 2021-01-31 18:24

    Building on @HYRY's answer, you could also use ndarray's method getfield:

    import numpy

    def fields_view(array, fields):
        # dtype containing only the requested fields, at their original offsets
        return array.getfield(numpy.dtype(
            {name: array.dtype.fields[name] for name in fields}
        ))
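
    A minimal usage sketch of the helper above, assuming a small example array (the name `strc` and its field layout are illustrative and not part of this answer). It suggests that writes made through the returned object reach the original array, which is what one would expect of a view:

```python
import numpy as np

def fields_view(array, fields):
    # Build a dtype holding only the requested fields at their original offsets.
    sub_dtype = np.dtype({name: array.dtype.fields[name] for name in fields})
    return array.getfield(sub_dtype)

# Illustrative array; the field names are assumptions made for this sketch.
strc = np.zeros(3, dtype=[('x', 'i8'), ('y', 'f8'), ('z', 'i8')])

v = fields_view(strc, ['x', 'z'])
v['x'][0] = 10      # write through the view
v['z'][0] = 100

# True if the two arrays overlap in memory.
shares = np.shares_memory(v, strc)
print(strc)
print(shares)
```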
    
  • 2021-01-31 18:25

    I don't think there is an easy way to achieve what you want. In general, you cannot take an arbitrary view into an array. Try the following:

    >>> a
    array([(1.5, 2.5, [[1.0, 2.0], [1.0, 2.0]]),
           (3.0, 4.0, [[4.0, 5.0], [4.0, 5.0]]),
           (1.0, 3.0, [[2.0, 6.0], [2.0, 6.0]])], 
          dtype=[('x', '<f8'), ('y', '<f8'), ('value', '<f8', (2, 2))])
    >>> a.view(float)
    array([ 1.5,  2.5,  1. ,  2. ,  1. ,  2. ,  3. ,  4. ,  4. ,  5. ,  4. ,
            5. ,  1. ,  3. ,  2. ,  6. ,  2. ,  6. ])
    

    The float view of your record array shows you how the actual data is stored in memory. A view into this data has to be expressible as a combination of a shape, strides and offset into the above data. So if you wanted, for instance, a view of 'x' and 'y' only, you could do the following:

    >>> from numpy.lib.stride_tricks import as_strided
    >>> b = as_strided(a.view(float), shape=a.shape + (2,),
                       strides=a.strides + a.view(float).strides)
    >>> b
    array([[ 1.5,  2.5],
           [ 3. ,  4. ],
           [ 1. ,  3. ]])
    

    The as_strided call does the same as the following, which is perhaps easier to understand:

    >>> bb = a.view(float).reshape(a.shape + (-1,))[:, :2]
    >>> bb
    array([[ 1.5,  2.5],
           [ 3. ,  4. ],
           [ 1. ,  3. ]])
    

    Both of these are views into a:

    >>> b[0, 0] = 0
    >>> a
    array([(0.0, 2.5, [[1.0, 2.0], [1.0, 2.0]]),
           (3.0, 4.0, [[4.0, 5.0], [4.0, 5.0]]),
           (1.0, 3.0, [[2.0, 6.0], [2.0, 6.0]])], 
          dtype=[('x', '<f8'), ('y', '<f8'), ('value', '<f8', (2, 2))])
    >>> bb[2, 1] = 0
    >>> a
    array([(0.0, 2.5, [[1.0, 2.0], [1.0, 2.0]]),
           (3.0, 4.0, [[4.0, 5.0], [4.0, 5.0]]),
           (1.0, 0.0, [[2.0, 6.0], [2.0, 6.0]])], 
          dtype=[('x', '<f8'), ('y', '<f8'), ('value', '<f8', (2, 2))])
    

    It would be nice if either of these could be converted into a record array, but numpy refuses to do so, for reasons that are not entirely clear to me:

    >>> b.view([('x',float), ('y',float)])
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    ValueError: new type not compatible with array.
    

    Of course what works (sort of) for 'x' and 'y' would not work, for instance, for 'x' and 'value', so in general the answer is: it cannot be done.
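
    To illustrate why a plain shape/strides view covers 'x' and 'y' but not 'x' and 'value', here is a small sketch that inspects the byte offsets of the fields. The dtype is the one shown above; the variable names are illustrative:

```python
import numpy as np

# Same record layout as the array `a` in this answer.
dt = np.dtype([('x', '<f8'), ('y', '<f8'), ('value', '<f8', (2, 2))])

# Byte offset of each field inside one record.
offsets = {name: dt.fields[name][1] for name in dt.names}
print(offsets)       # {'x': 0, 'y': 8, 'value': 16}
print(dt.itemsize)   # 48

# 'x' and 'y' are 8 bytes apart, so one constant stride reaches both.
# 'x' plus the four floats of 'value' sit at bytes 0, 16, 24, 32, 40:
# the gaps are not constant, so no single stride describes them.
value_bytes = [offsets['value'] + 8 * i for i in range(4)]
gaps = np.diff([offsets['x']] + value_bytes)
print(gaps.tolist())  # [16, 8, 8, 8]
```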

  • 2021-01-31 18:29

    In my case 'several columns' happens to be equal to two columns of the same data type, where I can use the following function to make a view:

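    import numpy as np

    def make_view(arr, fields, dtype):
        # Byte offsets of the two requested fields within one record
        offsets = [arr.dtype.fields[f][1] for f in fields]
        offset = min(offsets)
        stride = max(offsets)
        # Rows step by one record; columns step by the gap between the two fields
        return np.ndarray((len(arr), 2), buffer=arr, offset=offset,
                          strides=(arr.strides[0], stride - offset), dtype=dtype)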
    

    I think this boils down to the same thing @Jamie said: it cannot be done in general, but for two columns of the same dtype it can. The result of this function is not a dict but a good old-fashioned numpy array.
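
    A short usage sketch of this function, assuming an example array with two float fields separated by an unrelated field (the array `pts` and its field names are made up for illustration):

```python
import numpy as np

def make_view(arr, fields, dtype):
    # Byte offsets of the two requested fields within one record.
    offsets = [arr.dtype.fields[f][1] for f in fields]
    offset = min(offsets)
    stride = max(offsets)
    return np.ndarray((len(arr), 2), buffer=arr, offset=offset,
                      strides=(arr.strides[0], stride - offset), dtype=dtype)

# Hypothetical example data: 'x' and 'y' share a dtype, 'id' sits between them.
pts = np.zeros(3, dtype=[('x', 'f8'), ('id', 'i8'), ('y', 'f8')])

xy = make_view(pts, ['x', 'y'], np.float64)
xy[:, 0] = [1.0, 2.0, 3.0]   # writes into pts['x']
xy[1, 1] = -5.0              # writes into pts['y'][1]
print(pts)
```

    The result is an ordinary 2-D float array, so the field names are lost; column 0 is whichever field has the smaller offset.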

  • 2021-01-31 18:38

    You can create a dtype object that contains only the fields you want, and use numpy.ndarray() to create a view of the original array:

    import numpy as np
    strc = np.zeros(3, dtype=[('x', int), ('y', float), ('z', int), ('t', "i8")])
    
    def fields_view(arr, fields):
        dtype2 = np.dtype({name:arr.dtype.fields[name] for name in fields})
        return np.ndarray(arr.shape, dtype2, arr, 0, arr.strides)
    
    v1 = fields_view(strc, ["x", "z"])
    v1[0] = 10, 100
    
    v2 = fields_view(strc, ["y", "z"])
    v2[1:] = [(3.14, 7)]
    
    v3 = fields_view(strc, ["x", "t"])
    
    v3[1:] = [(1000, 2**16)]
    
    print(strc)
    

    Here is the output (from Python 2, hence the `L` suffix on long integers):

    [(10, 0.0, 100, 0L) (1000, 3.14, 7, 65536L) (1000, 3.14, 7, 65536L)]
    
  • 2021-01-31 18:47

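    The NumPy 1.12.0 release notes announced that, starting with NumPy 1.13, the code you propose would return a view. In practice the change was postponed and took effect in NumPy 1.16. See 'NumPy 1.12.0 Release Notes->Future Changes->Multiple-field manipulation of structured arrays' on this page: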

    https://docs.scipy.org/doc/numpy-dev/release.html
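
    A minimal sketch of the behaviour on a recent NumPy, assuming version 1.16 or later (the array and field names are illustrative). Indexing with a list of field names gives a view that keeps the original record size, and `repack_fields` from `numpy.lib.recfunctions` can be used when a packed copy is wanted instead:

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Illustrative array; assumes NumPy >= 1.16.
a = np.zeros(3, dtype=[('x', 'i8'), ('y', 'f8'), ('z', 'i8')])

# Multi-field indexing: a view whose fields keep their original offsets.
v = a[['x', 'z']]
v['x'][0] = 10
is_view = np.shares_memory(v, a)

# The view still spans the whole record, including the gap left by 'y'.
view_itemsize = v.dtype.itemsize

# repack_fields returns a packed copy without the gap.
packed = rfn.repack_fields(v)
packed_itemsize = packed.dtype.itemsize
print(is_view, view_itemsize, packed_itemsize)
```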
