Convert structured array with various numeric data types to regular array


Suppose I have a NumPy structured array with various numeric datatypes. As a basic example,

my_data = np.array([(17, 182.1), (19, 175.6)], dtype='i2,f4')


                      
4 Answers

不思量自难忘° (2021-01-06 12:05)
              
            
            
                                                                       
A variation on Warren's answer (which copies data by field):

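# Allocate a plain float32 array: one row per record, one column per field.
x = np.empty((my_data.shape[0], len(my_data.dtype.names)), dtype='f4')
for i, n in enumerate(my_data.dtype.names):
    x[:, i] = my_data[n]  # each field is cast to float32 on assignment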


Or you could iterate by row. Each r is a record that behaves like a tuple, and it has to be converted to a list in order to fill a row of x. With many rows and few fields, this will be slower than copying by field.

for i,r in enumerate(my_data):
    x[i,:]=list(r)


It may be instructive to try x.data=r.data and get an error: AttributeError: not enough data for array. x's data is a buffer of 4 floats (16 bytes). my_data's buffer holds 2 records, each containing an int and a float (the sequence [int, float, int, float]); with my_data.itemsize==6, that is only 12 bytes. One way or another, my_data has to be converted to all floats and the tuple grouping removed.
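The size mismatch described above can be checked directly. The following is a minimal sketch, assuming the same my_data and x as in this answer; the variable names record_size, src_bytes, dst_bytes and offsets are introduced here only for illustration.

```python
import numpy as np

my_data = np.array([(17, 182.1), (19, 175.6)], dtype='i2,f4')
x = np.empty((my_data.shape[0], len(my_data.dtype.names)), dtype='f4')

# Each record packs an int16 (2 bytes) followed by a float32 (4 bytes).
record_size = my_data.dtype.itemsize   # bytes per record
src_bytes = my_data.nbytes             # 2 records * 6 bytes
dst_bytes = x.nbytes                   # 4 float32 values * 4 bytes

# Byte offset of each field inside one record.
offsets = {name: my_data.dtype.fields[name][1] for name in my_data.dtype.names}

print(record_size, src_bytes, dst_bytes, offsets)
```

Because the structured buffer is smaller than the float buffer and interleaves two different types, it cannot simply be reinterpreted as floats; the values have to be converted first.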

But using astype as Jaime shows does work:

x.data=my_data.astype('f4,f4').data


In quick tests using a 1000-item array with 5 fields, copying field by field was just as fast as using astype.
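A comparison along these lines could be set up as follows. This is only a sketch: the 5-field dtype, the random test data, and the helper names by_field and by_astype are assumptions made for illustration, and the timings will vary by machine and NumPy version.

```python
import timeit
import numpy as np

# Hypothetical test data: 1000 records with 5 mixed int/float fields.
n_rows = 1000
data = np.zeros(n_rows, dtype='i2,f4,i4,f4,i2')
rng = np.random.default_rng(0)
for name in data.dtype.names:
    data[name] = rng.integers(0, 100, size=n_rows)

def by_field(arr):
    """Copy each field into one column of a float32 array."""
    out = np.empty((arr.shape[0], len(arr.dtype.names)), dtype='f4')
    for i, name in enumerate(arr.dtype.names):
        out[:, i] = arr[name]
    return out

def by_astype(arr):
    """Cast every field to float32, then view the records as a 2-D array."""
    n = len(arr.dtype.names)
    return arr.astype(','.join(['f4'] * n)).view('f4').reshape(-1, n)

# Both approaches should produce the same values.
assert np.array_equal(by_field(data), by_astype(data))

t_field = timeit.timeit(lambda: by_field(data), number=1000)
t_astype = timeit.timeit(lambda: by_astype(data), number=1000)
print(f"by field: {t_field:.4f} s, astype: {t_astype:.4f} s")
```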
                                                                        
                                                        
            
            
              
                
暖寄归人 (2021-01-06 12:10)
              
            
            
                                                                       
You can do it easily with Pandas:

>>> import pandas as pd
>>> pd.DataFrame(my_data).values
array([[  17.       ,  182.1000061],
       [  19.       ,  175.6000061]], dtype=float32)

                                                                        
                                                        
            
            
              
                
广开言路 (2021-01-06 12:12)
              
            
            
                                                                       
Here's one way (assuming my_data is a one-dimensional structured array):

In [26]: my_data
Out[26]: 
array([(17, 182.10000610351562), (19, 175.60000610351562)], 
      dtype=[('f0', '<i2'), ('f1', '<f4')])

In [27]: np.column_stack([my_data[name] for name in my_data.dtype.names])
Out[27]: 
array([[  17.       ,  182.1000061],
       [  19.       ,  175.6000061]], dtype=float32)
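The same idea can be wrapped in a small function. This is a sketch rather than part of the original answer: stack_fields is a hypothetical helper name, and the fields are passed as a list because recent NumPy versions expect a sequence here rather than a generator. It also shows why the result is float32: the output dtype follows NumPy's promotion rules for the field dtypes.

```python
import numpy as np

my_data = np.array([(17, 182.1), (19, 175.6)], dtype='i2,f4')

def stack_fields(arr):
    """Stack the fields of a 1-D structured array as columns (hypothetical helper)."""
    return np.column_stack([arr[name] for name in arr.dtype.names])

result = stack_fields(my_data)

# int16 and float32 promote to float32, which is the dtype of the result.
promoted = np.result_type(*[my_data.dtype[name] for name in my_data.dtype.names])

# With wider integers the promoted type changes, e.g. int32 with float32.
wider = np.result_type(np.int32, np.float32)

print(result.dtype, promoted, wider)
```

If a specific output dtype is needed regardless of the field types, the result can be cast afterwards with astype.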

                                                                        
                                                        
            
            
              
                
逝去的感伤 (2021-01-06 12:16)
              
            
            
                                                                       
The obvious way works:

>>> my_data
array([(17, 182.10000610351562), (19, 175.60000610351562)],
      dtype=[('f0', '<i2'), ('f1', '<f4')])
>>> n = len(my_data.dtype.names)  # n == 2
>>> my_data.astype(','.join(['f4']*n))
array([(17.0, 182.10000610351562), (19.0, 175.60000610351562)],
      dtype=[('f0', '<f4'), ('f1', '<f4')])
>>> my_data.astype(','.join(['f4']*n)).view('f4')
array([  17.       ,  182.1000061,   19.       ,  175.6000061], dtype=float32)
>>> my_data.astype(','.join(['f4']*n)).view('f4').reshape(-1, n)
array([[  17.       ,  182.1000061],
       [  19.       ,  175.6000061]], dtype=float32)
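As a side note that is not part of the original answer: newer NumPy releases ship a helper for this conversion, numpy.lib.recfunctions.structured_to_unstructured. A minimal sketch, assuming a NumPy version that includes it, with the output dtype given explicitly:

```python
import numpy as np
from numpy.lib import recfunctions as rfn

my_data = np.array([(17, 182.1), (19, 175.6)], dtype='i2,f4')

# Convert the records to a plain 2-D array, casting every field to float32.
flat = rfn.structured_to_unstructured(my_data, dtype=np.float32)

print(flat.shape, flat.dtype)
```

This gives the same shape as the astype/view/reshape chain above, with one row per record and one column per field.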

                                                                        
                                                        
            
            
              
                