parse a csv file that contains commans in the fields with awk

后端未结

关注

 4  925

i have to use awk to print out 4 different columns in a csv file. The problem is the strings are in a $x,xxx.xx format. When I run the regular awk command.

awk -


                      
              相关标签:


      
      
        
          4条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  日久生厌        
                
              
                            
                2021-02-05 11:17
              
            
            
                                                                       
I think what you're saying is that you want to split the input into CSV fields while not getting tripped up by the commas inside the double quotes.  If so...

First, use "," as the field separator, like this:

awk -F'","' '{print $1}'


But then you'll still end up with a stray double-quote at the beginning of $1 (and at the end of the last field).  Handle that by stripping quotes out with gsub, like this:

awk -F'","' '{x=$1; gsub("\"","",x); print x}'


Result:

echo '"abc,def","ghi,xyz"' | awk -F'","' '{x=$1; gsub("\"","",x); print x}'

abc,def

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  孤独总比滥情好        
                
              
                            
                2021-02-05 11:23
              
            
            
                                                                       
The data file:

$ cat data.txt
"$307.00","$132.34","$30.23"


The AWK script:

$ cat csv.awk
BEGIN { RS = "," }
{ gsub("\"", "", $1);
  print $1 }


The execution:

$ awk -f csv.awk data.txt
$307.00
$132.34
$30.23

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  孤城傲影        
                
              
                            
                2021-02-05 11:24
              
            
            
                                                                       
Oddly enough I had to tackle this problem some time ago and I kept the code around to do it.  You almost had it, but you need to get a bit tricky with your field separator(s).

awk -F'","|^"|"$' '{print $2}' testfile.csv 


Input

# cat testfile.csv
"$141,818.88","$52,831,578.53","$52,788,069.53"
"$2,558.20","$482,619.11","$9,687,142.69"
"$786.48","$8,568,159.41","$159,180,818.00"


Output

# awk -F'","|^"|"$' '{print $2}' testfile.csv
$141,818.88
$2,558.20
$786.48


You'll note that the "first" field is actually $2 because of the field separator ^". Small price to pay for a short 1-liner if you ask me.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  长发绾君心        
                
              
                            
                2021-02-05 11:26
              
            
            
                                                                       
In order to let awk handle quoted fields that contain the field separator, you can use a small script I wrote called csvquote. It temporarily replaces the offending commas with nonprinting characters, and then you restore them at the end of your pipeline. Like this:

csvquote testfile.csv | awk -F, {print $1} | csvquote -u


This would also work with any other UNIX text processing program like cut:

csvquote testfile.csv | cut -d, -f1 | csvquote -u


You can get the csvquote code here: https://github.com/dbro/csvquote
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
                             
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复