IFS separate a string like "Hello","World","this","is, a boring","line"


I'm trying to parse a .csv file and I have some problems with IFS. The file contains lines like this:

"Hello","World","this","is, a boring","line"

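For reference, a minimal sketch of the naive approach and why it falls short, assuming the goal is one array element per CSV field (the variable names `line` and `fields` are only illustrative):

```shell
#!/usr/bin/env bash
line='"Hello","World","this","is, a boring","line"'

# Naive split: IFS treats every comma as a separator, quoted or not.
IFS=, read -ra fields <<<"$line"

# Show how many fields came out and what each one contains.
echo "${#fields[@]}"
printf '[%s]\n' "${fields[@]}"
```

The quoted field "is, a boring" is cut in two at its inner comma, so six elements come out instead of five, and the quote characters stay in the data. IFS alone has no notion of quoting.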

                      
3 answers

慢半拍i
2021-01-29 07:20

bashlib provides a csvline function. Assuming you've installed it somewhere in your PATH:

line='"Hello","World","this","is, a boring","line"'

source bashlib
csvline <<<"$line"
printf '%s\n' "${CSVLINE[@]}"


...output from the above being:

Hello
World
this
is, a boring
line




To quote the implementation (which is copyright lhunath, the below text being taken from this specific revision of the relevant git repo):

#  _______________________________________________________________________
# |__ csvline ____________________________________________________________|
#
#       csvline [-d delimiter] [-D line-delimiter]
#
# Parse a CSV record from standard input, storing the fields in the CSVLINE array.
#
# By default, a single line of input is read and parsed into comma-delimited fields.
# Fields can optionally contain double-quoted data, including field delimiters.
#
# A different field delimiter can be specified using -d.  You can use -D
# to change the definition of a "record" (eg. to support NULL-delimited records).
#
csvline() {
    CSVLINE=()
    local line field quoted=0 delimiter=, lineDelimiter=$'\n' c
    local OPTIND=1 arg
    while getopts :d: arg; do
        case $arg in
            d) delimiter=$OPTARG ;;
        esac
    done

    IFS= read -d "$lineDelimiter" -r line || return
    while IFS= read -rn1 c; do
        case $c in
            \")
                (( quoted = !quoted ))
                continue ;;
            $delimiter)
                if (( ! quoted )); then
                    CSVLINE+=( "$field" ) field=
                    continue
                fi ;;
        esac
        field+=$c
    done <<< "$line"
    [[ $field ]] && CSVLINE+=( "$field" ) ||:
} # _____________________________________________________________________
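
One behavior worth knowing: in the function as quoted, the final `[[ $field ]] && CSVLINE+=( "$field" )` appends the last field only when it is non-empty, so a record ending in a delimiter loses its trailing empty field. The following is a minimal standalone sketch of the same character-by-character state machine that keeps such fields. `parse_csv_line` and `CSV_FIELDS` are hypothetical names chosen for illustration; they are not part of bashlib, and like the original this sketch does not treat a doubled quote inside a quoted field as a literal quote.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of bashlib: splits one CSV line into the
# global array CSV_FIELDS, keeping empty fields (including a trailing one).
parse_csv_line() {
    local line=$1 delimiter=${2:-,} field= quoted=0 c i
    CSV_FIELDS=()
    for (( i = 0; i < ${#line}; i++ )); do
        c=${line:i:1}
        if [[ $c == '"' ]]; then
            quoted=$(( 1 - quoted ))    # toggle the in-quotes state, drop the quote
        elif [[ $c == "$delimiter" ]] && (( ! quoted )); then
            CSV_FIELDS+=( "$field" )    # an unquoted delimiter ends the field
            field=
        else
            field+=$c                   # any other character belongs to the field
        fi
    done
    CSV_FIELDS+=( "$field" )            # always emit the last field, even if empty
}

# Note the trailing comma: the record has a sixth, empty field.
parse_csv_line '"Hello","World","this","is, a boring","line",'
printf '[%s]\n' "${CSV_FIELDS[@]}"
```

Taking the line as an argument instead of reading standard input is a simplification made for this sketch; the second, optional argument plays the role of `-d`.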

                                                                        
                                                        
            
            
              
                
自闭症患者
2021-01-29 07:26

Give this a try:

sed 's/","/"\n"/g' <<<"${line}"


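sed's s command performs a search and replace, using a regular expression as the search pattern.

Here the pattern "," (quote, comma, quote) is replaced with "\n", so only the commas that sit between two quoted fields become newlines. A \n in the replacement is understood by GNU sed; other sed implementations may treat it differently.

As a consequence, each element ends up on a separate line, still wrapped in its quotes. This relies on every field being quoted.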
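
To get from that output to a bash array, something along these lines may work. It is a sketch that assumes GNU sed, bash 4 or newer (for `mapfile`), that every field is quoted, and that no field contains the sequence `","` or a newline; `fields` is just an illustrative name.

```shell
#!/usr/bin/env bash
line='"Hello","World","this","is, a boring","line"'

# One field per line, then read those lines into an array.
mapfile -t fields < <(sed 's/","/"\n"/g' <<<"$line")

# Strip the leading and trailing double quote from every element.
fields=( "${fields[@]#\"}" )
fields=( "${fields[@]%\"}" )

printf '%s\n' "${fields[@]}"
```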
                                                                        
                                                        
            
            
              
                
借酒劲吻你
2021-01-29 07:33

You may wish to use gawk with FPAT, which describes what a field looks like rather than what separates fields:

Input :


  "hello","world","this,is"    


Script :

gawk 'BEGIN{OFS="\n";FPAT="([^,]+)|(\"[^\"]+\")"}{$1=$1;print $0}' somefile.csv


Output :


  "hello"
  "world"
  "this,is"

                                                                        
                                                        
            
            
              
                