load csv file to numpy and access columns by name


I have a CSV file, test.csv, with a header row like this:

"A","B","C","D","E","F","timestamp"
611.88243,


                      
2 answers

太阳男子, 2021-01-12 01:59

Using NumPy alone, the options you show are your only options: either an ndarray of homogeneous dtype with shape (3,7), or a structured array of (potentially) heterogeneous dtype with shape (3,).
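
To make the two shapes concrete, here is a minimal sketch that loads the same text both ways. The three-column sample data is made up for illustration (it is not the file from the question) and is read from an in-memory string so the snippet runs on its own.

```python
import io

import numpy as np

# Illustrative sample data; the values are invented for this sketch.
CSV_TEXT = "A,B,C\n1.5,2.5,3.5\n4.5,5.5,6.5\n7.5,8.5,9.5\n"

# Option 1: a plain 2D array with a single dtype; the header row is skipped.
plain = np.loadtxt(io.StringIO(CSV_TEXT), delimiter=",", skiprows=1)

# Option 2: a structured array with one record per row and one named
# field per column, taken from the header row.
structured = np.genfromtxt(io.StringIO(CSV_TEXT), delimiter=",", names=True)

print(plain.shape)             # (3, 3)
print(structured.shape)        # (3,)
print(structured.dtype.names)  # ('A', 'B', 'C')
print(structured["A"])         # column access by name
print(plain[:, 0])             # column access by position
```

The plain array supports 2D slicing but has no names; the structured array has names but is one-dimensional.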

If you really want a data structure with labeled columns and shape (3,7) (and lots of other goodies), you could use a pandas DataFrame:

In [67]: import pandas as pd
In [68]: df = pd.read_csv('data'); df
Out[68]: 
           A          B     C          D           E          F     timestamp
0  611.88243  9089.5601  5133  864.07514  1715.37476  765.22777  1.291112e+12
1  611.88243  9089.5601  5133  864.07514  1715.37476  765.22777  1.291113e+12
2  611.88243  9089.5601  5133  864.07514  1715.37476  765.22777  1.291121e+12    

In [70]: df['A']
Out[70]: 
0    611.88243
1    611.88243
2    611.88243
Name: A, dtype: float64

In [71]: df.shape
Out[71]: (3, 7)
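
If a plain NumPy array is needed again later, a DataFrame can hand one back. The sketch below assumes pandas is installed and uses made-up sample data read from an in-memory string; `DataFrame.to_numpy` and `Index.get_loc` are the pandas calls doing the work.

```python
import io

import pandas as pd

# Illustrative sample data; the values are invented for this sketch.
CSV_TEXT = "A,B,C\n1.5,2.5,3.5\n4.5,5.5,6.5\n"

df = pd.read_csv(io.StringIO(CSV_TEXT))

# The whole table as a 2D NumPy array (float64 here, since every column is float).
arr = df.to_numpy()

# A single column as a 1D NumPy array, selected by label.
col_a = df["A"].to_numpy()

# Translate a label into a positional index for use on the plain array.
idx_b = df.columns.get_loc("B")

print(arr.shape)      # (2, 3)
print(arr[:, idx_b])  # the "B" column of the plain array
```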




A pure NumPy/Python alternative would be to use a dict to map the column names to indices:

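import csv

import numpy as np

filename = "test.csv"  # the CSV file from the question

# Read only the header row and map each column name to its index.
with open(filename, newline="") as f:
    reader = csv.reader(f)
    columns = next(reader)
    colmap = {name: i for i, name in enumerate(columns)}

# Load the numeric rows, skipping the header. np.matrix keeps column
# slices 2D; NumPy's documentation no longer recommends this class.
arr = np.matrix(np.loadtxt(filename, delimiter=",", skiprows=1))
print(arr[:, colmap['A']])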


yields

[[ 611.88243]
 [ 611.88243]
 [ 611.88243]]


This way, arr is a NumPy matrix, with columns that can be accessed by label using the syntax

arr[:, colmap[column_name]]
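
The same idea also works with a regular ndarray instead of np.matrix. The following is a sketch of that variation, not the code above: `load_with_colmap` is a hypothetical helper name, the sample data is made up, and the text is parsed from a string so the snippet is self-contained.

```python
import csv
import io

import numpy as np

# Illustrative sample data; the values are invented for this sketch.
CSV_TEXT = "A,B,C\n1.5,2.5,3.5\n4.5,5.5,6.5\n"


def load_with_colmap(text):
    """Hypothetical helper: return (2D float array, {column name: index})."""
    reader = csv.reader(io.StringIO(text))
    columns = next(reader)  # header row
    colmap = {name: i for i, name in enumerate(columns)}
    arr = np.array([[float(x) for x in row] for row in reader])
    return arr, colmap


arr, colmap = load_with_colmap(CSV_TEXT)

# One column by label (1D result with an ndarray, unlike np.matrix).
col_b = arr[:, colmap["B"]]

# Several columns by label (2D result).
cols_ac = arr[:, [colmap[name] for name in ("A", "C")]]

print(col_b)
print(cols_ac.shape)  # (2, 2)
```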

                                                                        
                                                        
            
            
              
                
青春惊慌失措, 2021-01-12 02:02

Because your data is homogeneous--all the elements are floating-point values--you can create a view of the structured array returned by genfromtxt that is a regular 2D array. For example,

In [42]: r = np.genfromtxt("test.csv", delimiter=',', names=True)


Create a NumPy array that is a "view" of r. This is a regular NumPy array, but it uses the data in r instead of copying it:

In [43]: a = r.view(np.float64).reshape(len(r), -1)

In [44]: a.shape
Out[44]: (3, 7)

In [45]: a[:, 0]
Out[45]: array([ 611.88243,  611.88243,  611.88243])

In [46]: r['A']
Out[46]: array([ 611.88243,  611.88243,  611.88243])
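
The view trick only makes sense when every field has the same dtype. A small guard can check that first. This is a sketch under assumptions: `as_2d_view` is a hypothetical helper name, the structured dtype is assumed to be packed (no padding between fields, which is what genfromtxt produces by default), and the sample arrays are made up.

```python
import numpy as np


def as_2d_view(rec):
    """Hypothetical helper: view a homogeneous structured array as 2D."""
    rec = np.atleast_1d(rec)  # a file with one data row yields a 0-d array
    field_dtypes = {rec.dtype[name] for name in rec.dtype.names}
    if len(field_dtypes) != 1:
        raise TypeError("fields have different dtypes; a 2D view is not possible")
    (common,) = field_dtypes
    return rec.view(common).reshape(len(rec), -1)


# Homogeneous records: three float64 fields per row.
homogeneous = np.array(
    [(1.5, 2.5, 3.5), (4.5, 5.5, 6.5)],
    dtype=[("A", "f8"), ("B", "f8"), ("C", "f8")],
)
a = as_2d_view(homogeneous)
print(a.shape)  # (2, 3)

# Mixed records: the guard refuses instead of reinterpreting raw bytes.
mixed = np.zeros(2, dtype=[("x", "f8"), ("n", "i4")])
try:
    as_2d_view(mixed)
    refused = False
except TypeError:
    refused = True
print(refused)  # True
```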


r and a refer to the same block of memory:

In [47]: a[0, 0] = -1

In [48]: r['A']
Out[48]: array([  -1.     ,  611.88243,  611.88243])
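
The sharing can also be checked directly, and a copy can be taken when independent data is wanted. The sketch below uses made-up sample data read from a string; `np.shares_memory` and `numpy.lib.recfunctions.structured_to_unstructured` are existing NumPy functions, the latter being an alternative to the manual view that is not part of the answer above.

```python
import io

import numpy as np
from numpy.lib import recfunctions as rfn

# Illustrative sample data; the values are invented for this sketch.
CSV_TEXT = "A,B,C\n1.5,2.5,3.5\n4.5,5.5,6.5\n"

r = np.genfromtxt(io.StringIO(CSV_TEXT), delimiter=",", names=True)

# The 2D view uses the same buffer as r.
a = r.view(np.float64).reshape(len(r), -1)
shared = np.shares_memory(a, r)

# Writing through the view changes the structured array too.
a[0, 0] = -1.0

# An explicit copy is independent of r.
independent = rfn.structured_to_unstructured(r, copy=True)
independent[1, 1] = 0.0

print(shared)     # True
print(r["A"][0])  # -1.0
print(r["B"][1])  # 5.5, unchanged by the write to the copy
```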

                                                                        
                                                        
            
            
              
                