Splitting a CSV file by columns


I have a really huge CSV file. It has about 1700 columns and 40000 rows, like below:

x,y,z,x1,x2,x3,x4,x5,x6,x7,x8,x9,...(about 1700 more)...,x1700
0,0,0,


                      
4 Answers

离开以前, 2020-12-20 05:49
              
            
            
                                                                       
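Use a small Python script like:

fin = 'file_in.csv'
fout1 = 'file_out1.csv'
# ... define the other output file names the same way

# Read the input line by line instead of loading the whole file into memory
with open(fin) as fin_fd, open(fout1, 'w') as fout1_fd:
    for line in fin_fd:
        # Plain split: assumes no field contains a quoted comma
        l_arr = line.rstrip('\n').split(',')
        fout1_fd.write(','.join(l_arr[0:3]))
        fout1_fd.write('\n')
        # ... write the other column ranges to the other output files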
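
The script above splits each line on commas by hand, which can misplace columns if a field contains a quoted comma. As a rough sketch of the same idea built on the standard csv module, something like the following might work. The helper name split_columns and the output file names are hypothetical, not part of the original answer, and the column ranges are passed in rather than guessed.

```python
import csv
from contextlib import ExitStack


def split_columns(src_path, outputs):
    """Stream src_path row by row and write each column slice to its own file.

    outputs is a list of (file name, slice) pairs; the names are hypothetical.
    Returns the number of rows copied.
    """
    rows = 0
    with ExitStack() as stack:
        # ExitStack closes the input and every output file on exit
        src = stack.enter_context(open(src_path, newline=''))
        writers = [
            (csv.writer(stack.enter_context(open(name, 'w', newline=''))), cols)
            for name, cols in outputs
        ]
        for row in csv.reader(src):
            for writer, cols in writers:
                writer.writerow(row[cols])
            rows += 1
    return rows
```

A call such as split_columns('file_in.csv', [('file_out1.csv', slice(0, 3))]) would write only the first three columns; further (file name, slice) pairs can be appended for the remaining ranges. Only one row is held in memory at a time, which matters for a file of this size.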

                                                                        
                                                        
            
            
              
                
迷失自我, 2020-12-20 05:54
              
            
            
                                                                       
A one-line solution for your example data and desired output:

cut -d, -f -3 huge.csv > file1.csv
cut -d, -f 4-1003 huge.csv > file2.csv
cut -d, -f 1004- huge.csv > file3.csv


The cut program is available on most POSIX platforms and is part of GNU Core Utilities. There is also a Windows version.

Update: a Python version, since the OP asked for a program in an acceptable language:

# Python 3
import csv
import fileinput

output_specifications = (  # (csv file name, column slice)
    ('file1.csv', slice(3)),
    ('file2.csv', slice(3, 1003)),
    ('file3.csv', slice(1003, 1703)),
)
# In Python 3, csv.writer needs text-mode files opened with newline=''
output_files = [
    open(file_name, 'w', newline='') for file_name, _ in output_specifications
]
output_row_writers = [
    (csv.writer(f, quoting=csv.QUOTE_MINIMAL).writerow, selector)
    for f, (_, selector) in zip(output_files, output_specifications)
]

# fileinput reads the files named on the command line, or stdin if none
reader = csv.reader(fileinput.input())
for row in reader:
    for row_writer, selector in output_row_writers:
        row_writer(row[selector])

for f in output_files:
    f.close()


This works with the sample data given and can be called with the input.csv as an argument or by piping from stdin.
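
The cut field ranges and the Python slices describe columns in two different conventions: cut counts fields from 1 and includes both ends of a range, while Python slices count from 0 and exclude the end index. A small sketch of the conversion may help when checking that the two versions select the same columns. The helper name cut_range_to_slice is hypothetical, and it handles a single range only, not comma-separated lists.

```python
def cut_range_to_slice(spec):
    """Convert one cut-style field range ('-3', '2-4', '5-', '7') to a slice.

    cut fields are 1-based and inclusive; Python slices are 0-based and
    exclude the end index.
    """
    first, sep, last = spec.partition('-')
    start = int(first) - 1 if first else 0
    if not sep:
        # A single field such as '7' selects exactly one column
        return slice(start, start + 1)
    stop = int(last) if last else None
    return slice(start, stop)


row = ['x', 'y', 'z', 'x1', 'x2', 'x3']
first_three = row[cut_range_to_slice('-3')]   # same columns as cut -f -3
remainder = row[cut_range_to_slice('4-')]     # same columns as cut -f 4-
```

With this mapping, a cut range N-M corresponds to slice(N - 1, M), so the upper bound stays the same number while the lower bound drops by one.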
                                                                        
                                                        
            
            
              
                
醉梦人生, 2020-12-20 06:03
              
            
            
                                                                       
You can open the file in Microsoft Excel, delete the columns you do not need, and save the result as CSV to get file #1. Repeat the same procedure for the other two files.
                                                                        
                                                        
            
            
              
                
北海茫月, 2020-12-20 06:03
              
            
            
                                                                       
I usually use OpenOffice (or Microsoft Excel, if you are on Windows) for this: open the file, change it, and save it, without writing any program. The following two links show how to do that.

https://superuser.com/questions/407082/easiest-way-to-open-csv-with-commas-in-excel

http://office.microsoft.com/en-us/excel-help/import-or-export-text-txt-or-csv-files-HP010099725.aspx
                                                                        
                                                        
            
            
              
                