Removing in-field quotes in csv file

后端未结

关注

 3  737

Let\'s say we have a comma separated file (csv) like this:

\"name of movie\",\"starring\",\"director\",\"release year\"
\"dark knight rises\",\"christian bale,


                      
              相关标签:


      
      
        
          3条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  青春惊慌失措        
                
              
                            
                2021-01-29 04:47
              
            
            
                                                                       
With awk you can do something like:

awk -v Q='"' '{ gsub("[\"']","") ; gsub(",",Q "," Q) ; print Q $0 Q }'

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  野性不改        
                
              
                            
                2021-01-29 04:56
              
            
            
                                                                       
How about

import csv

def remove_quotes(s):
    return ''.join(c for c in s if c not in ('"', "'"))

with open("fixquote.csv","rb") as infile, open("fixed.csv","wb") as outfile:
    reader = csv.reader(infile)
    writer = csv.writer(outfile, quoting=csv.QUOTE_ALL)
    for line in reader:
        writer.writerow([remove_quotes(elem) for elem in line])


which produces

~/coding$ cat fixed.csv 
"name of movie","starring","director","release year"
"dark knight rises","christian bale, anna hathaway","christopher nolan","2012"
"the dark knight","christian bale, heath ledger","christopher nolan","2008"
"The day when earth stood still","Michael Rennie,the strong man","robert wise","1951"
"the gladiator","russel the awesome crowe","ridley scott","2000"


BTW, you might want to check the spelling of some of those names..
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  别那么骄傲        
                
              
                            
                2021-01-29 04:59
              
            
            
                                                                       
Split the values into an array. Iterate through the array removing any quotes, other than the first and last character. Hope it helps.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
                             
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复