Convert structured array with various numeric data types to regular array

天命终不由人 2021-01-06 11:24

Suppose I have a NumPy structured array with various numeric datatypes. As a basic example,

my_data = np.array([(17, 182.1), (19, 175.6)], dtype='i2,f4')


        
4 Answers
  • 2021-01-06 12:05

    A variation on Warren's answer (which copies data by field):

    # Allocate a float32 result with one column per field.
    x = np.empty((my_data.shape[0], len(my_data.dtype)), dtype='f4')
    for i, n in enumerate(my_data.dtype.names):
        x[:, i] = my_data[n]  # copy field n into column i
    

    Or you could iterate by row. Each row r is tuple-like and has to be converted to a list in order to fill a row of x. With many rows and few fields, this will be slower than copying by field.

    for i, r in enumerate(my_data):
        x[i, :] = list(r)
    

    It may be instructive to try x.data=my_data.data, which raises an error: AttributeError: not enough data for array. The buffer of x holds 4 floats (16 bytes). The buffer of my_data holds 2 records, each an int followed by a float (that is, the sequence [int float int float]); with my_data.itemsize==6, that is only 12 bytes. One way or another, my_data has to be converted to all floats, and the record grouping removed.
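
    The size mismatch described above can be checked directly. This is a minimal sketch that assumes the my_data array from the question and an x allocated as in the first snippet; it only inspects sizes and does not attempt the buffer assignment itself.

```python
import numpy as np

# Example array from the question: a 2-byte int followed by a 4-byte float.
my_data = np.array([(17, 182.1), (19, 175.6)], dtype='i2,f4')

# Target array: 2 rows x 2 columns of 4-byte floats.
x = np.empty((my_data.shape[0], len(my_data.dtype.names)), dtype='f4')

print(my_data.itemsize)  # bytes per record: 2 + 4
print(my_data.nbytes)    # bytes in the whole structured buffer
print(x.nbytes)          # bytes needed by the float32 result
```

    The structured buffer is smaller than the target because the int field takes 2 bytes rather than 4, which is why the raw bytes cannot simply be reinterpreted as floats.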

    But using astype as Jaime shows does work:

    x.data=my_data.astype('f4,f4').data
    

    In quick tests using a 1000-item array with 5 fields, copying field by field is just as fast as using astype.
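
    A comparison along these lines could be set up roughly as follows. This is only a sketch: the field layout (one i2 field followed by four f4 fields), the helper names by_field and by_astype, and the repeat count are assumptions made for illustration, and timings will vary with machine and NumPy version.

```python
import timeit
import numpy as np

# Hypothetical test data: 1000 records, 5 numeric fields (layout is an assumption).
n_rows, n_fields = 1000, 5
dt = np.dtype(','.join(['i2'] + ['f4'] * (n_fields - 1)))
data = np.zeros(n_rows, dtype=dt)
for k, name in enumerate(data.dtype.names):
    data[name] = np.arange(n_rows) + k  # small values, safe for int16

def by_field(a):
    """Copy each field into one column of a float32 array."""
    out = np.empty((a.shape[0], len(a.dtype.names)), dtype='f4')
    for i, name in enumerate(a.dtype.names):
        out[:, i] = a[name]
    return out

def by_astype(a):
    """Cast every field to f4, then view the buffer as a plain 2-D array."""
    n = len(a.dtype.names)
    return a.astype(','.join(['f4'] * n)).view('f4').reshape(-1, n)

t_field = timeit.timeit(lambda: by_field(data), number=200)
t_astype = timeit.timeit(lambda: by_astype(data), number=200)
print(f"by field: {t_field:.4f} s, astype: {t_astype:.4f} s")
```

    Both helpers should return the same (1000, 5) float32 array, so the comparison is between equivalent results.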

  • 2021-01-06 12:10

    You can do it easily with Pandas:

    >>> import pandas as pd
    >>> pd.DataFrame(my_data).values
    array([[  17.       ,  182.1000061],
           [  19.       ,  175.6000061]], dtype=float32)
    
  • 2021-01-06 12:12

    Here's one way (assuming my_data is a one-dimensional structured array):

    In [26]: my_data
    Out[26]: 
    array([(17, 182.10000610351562), (19, 175.60000610351562)], 
          dtype=[('f0', '<i2'), ('f1', '<f4')])
    
    In [27]: np.column_stack([my_data[name] for name in my_data.dtype.names])
    Out[27]: 
    array([[  17.       ,  182.1000061],
           [  19.       ,  175.6000061]], dtype=float32)
    
  • 2021-01-06 12:16

    The obvious way works:

    >>> my_data
    array([(17, 182.10000610351562), (19, 175.60000610351562)],
          dtype=[('f0', '<i2'), ('f1', '<f4')])
    >>> n = len(my_data.dtype.names)  # n == 2
    >>> my_data.astype(','.join(['f4']*n))
    array([(17.0, 182.10000610351562), (19.0, 175.60000610351562)],
          dtype=[('f0', '<f4'), ('f1', '<f4')])
    >>> my_data.astype(','.join(['f4']*n)).view('f4')
    array([  17.       ,  182.1000061,   19.       ,  175.6000061], dtype=float32)
    >>> my_data.astype(','.join(['f4']*n)).view('f4').reshape(-1, n)
    array([[  17.       ,  182.1000061],
           [  19.       ,  175.6000061]], dtype=float32)
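
    The three steps in the session above (astype, view, reshape) can be collected into one runnable script. As a cross-check, the sketch below also calls numpy.lib.recfunctions.structured_to_unstructured, which is not part of the answer above and assumes NumPy 1.16 or later; the variable names via_astype and via_helper are chosen here for illustration.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

my_data = np.array([(17, 182.1), (19, 175.6)], dtype='i2,f4')

# Steps from the session: cast all fields to f4, drop the record
# structure with a view, then restore one row per record.
n = len(my_data.dtype.names)
via_astype = my_data.astype(','.join(['f4'] * n)).view('f4').reshape(-1, n)

# Library helper (assumption: NumPy >= 1.16 is installed).
via_helper = rfn.structured_to_unstructured(my_data, dtype=np.float32)

print(via_astype.shape, via_astype.dtype)
print(np.array_equal(via_astype, via_helper))
```

    If the helper is available, it avoids building the dtype string by hand; otherwise the astype route works on its own.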
    