Separate fields by comma using bash

滥情空心 2021-01-25 01:07

How do I place commas between fields?

Input data

12123 \'QA test case 1\' \'QA environment\'   
12234 \'UAT test case 1\' \'UAT environment\'  


        
8 Answers
  • 2021-01-25 01:43

    Your input data looks very much like an argument list. One convenient approach is therefore to define a bash function that returns its argument list as comma-separated tokens, and to invoke it for each line in your file.

    However, the simple implementation below will lose the quotes around the multi-word tokens (but it will place the commas properly). If you need those quotes exactly as they were, it would get a bit more complicated (it's very easy to output every token quoted though):

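    #!/bin/bash
    # Print all arguments on one line, separated by ', '.
    function csv_args() {
        # Test the argument count, not "$1", so an empty argument does not end the loop early.
        while [ $# -gt 0 ]; do
            printf '%s' "$1"
            shift
            [ $# -gt 0 ] && printf ', '
        done
        echo
    }
    
    # eval re-parses each line as shell words: only use this on trusted input.
    while read line; do
        eval csv_args $line
    done < /path/to/your/file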
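
    The "output every token quoted" variant mentioned above could look roughly like the sketch below. The function name csv_args_quoted is made up for illustration, the sample data is inlined through a here-document, and the fields are assumed to be delimited by plain single quotes with no single quotes inside them.

```shell
#!/bin/bash
# Hypothetical helper (not from the answer above): print all arguments on one
# line, each wrapped in single quotes and separated by ', '.
# Assumption: no argument contains a single quote itself.
csv_args_quoted() {
    local sep='' arg
    for arg in "$@"; do
        printf "%s'%s'" "$sep" "$arg"
        sep=', '
    done
    printf '\n'
}

# Sample data from the question, inlined for the demo.
# eval re-parses each line as shell words: only use this on trusted input.
while read -r line; do
    eval "csv_args_quoted $line"
done <<'EOF'
12123 'QA test case 1' 'QA environment'
12234 'UAT test case 1' 'UAT environment'
EOF
```

    Because eval has already removed the original quotes, the function adds them back around every token, including the numeric first field.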
    
  • 2021-01-25 01:44

    A naïve bash implementation that assumes that no (escaped) ' instances ever appear inside a field:

    • Original single-quoting is preserved.
    • Accepts any number of input fields.
    • Any fields may be single-quoted.
    • Caveat: whitespace between fields is normalized (replaced with a single space each), as is whitespace inside a quoted field.

    Input is assumed to come from a file named file:

    # Read all whitespace-separated tokens (potentially across quoted field boundaries).
    while read -ra tkns; do  
      # Initialize per-line variables.
      numTkns=${#tkns[@]} i=0 inField=0
      # Loop over all tokens.
      for tkn in "${tkns[@]}"; do
        # Determine if we're inside a quoted field.
        [[ $tkn == \'* ]] && inField=1
        [[ $tkn == *\' ]] && inField=0
        # Determine the output separator:
        if (( ++i == numTkns )); then
          sep=$'\n' # last token, terminate output line with \n
        else
          # inside a field: use just a space; between fields: use ', '
          (( inField )) && sep=' ' || sep=', '
        fi
        # Output token and separator.
        printf '%s%s' "$tkn" "$sep"
      done
    done < file
    
  • 2021-01-25 01:44

    The solution is to walk through the record one character at a time (while read -n 1). Consecutive non-space characters form one element, and everything enclosed in single quotes (or escaped single quotes, \', as in your input) also counts as one element, spaces included. Every time you complete an element (on reaching a space or a newline), append it to an array. On a newline or at EOF, end the record and print the array in your format. Then clear the array and start the next record.

    Want source code? Show me your work first. :)
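
    One possible reading of the character-by-character walk described above, as a minimal sketch rather than the answerer's own code. The function name parse_fields is hypothetical. It assumes every single quote toggles "inside a quoted field", so an apostrophe inside a field would confuse it; a backslash before a quote is simply kept as part of the field.

```shell
#!/bin/bash
# Hypothetical helper: read records from stdin one character at a time and
# print each record's fields joined by ', '. Quotes are kept in the output.
parse_fields() {
    local ch field='' in_quote=0 out f
    local -a fields=()

    # IFS= and -r keep spaces and backslashes intact; -n 1 reads one character.
    # On a newline, read succeeds and leaves ch empty.
    while IFS= read -r -n 1 ch; do
        if [[ -z $ch ]]; then
            # End of record: store the pending field and print the record.
            if [[ -n $field ]]; then fields+=("$field"); fi
            if (( ${#fields[@]} > 0 )); then
                out=${fields[0]}
                for f in "${fields[@]:1}"; do out+=", $f"; done
                printf '%s\n' "$out"
            fi
            fields=() field='' in_quote=0
        elif [[ $ch == "'" ]]; then
            # A single quote toggles the quoted state and stays in the field.
            field+=$ch
            in_quote=$(( 1 - in_quote ))
        elif (( in_quote == 0 )) && [[ $ch == [[:space:]] ]]; then
            # Whitespace outside quotes ends the current field, if any.
            if [[ -n $field ]]; then fields+=("$field"); fi
            field=''
        else
            field+=$ch
        fi
    done

    # EOF without a trailing newline: print what was collected so far.
    if [[ -n $field ]]; then fields+=("$field"); fi
    if (( ${#fields[@]} > 0 )); then
        out=${fields[0]}
        for f in "${fields[@]:1}"; do out+=", $f"; done
        printf '%s\n' "$out"
    fi
    return 0
}

# Demo with the sample data from the question.
parse_fields <<'EOF'
12123 'QA test case 1' 'QA environment'
12234 'UAT test case 1' 'UAT environment'
EOF
```

    Reading one character per read call is slow on large files, so this is mainly useful when the quoting rules are too irregular for sed or awk.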

  • 2021-01-25 01:50

    Another option is to use a CSV parser:

    ruby -rcsv -ne '
      puts CSV.generate_line(
             CSV.parse_line($_.strip, col_sep: " ", quote_char: "'\''"),
             force_quotes: true)
    '  file
    
    "12123","QA test case 1","QA environment"
    "12234","UAT test case 1","UAT environment"
    
  • 2021-01-25 01:51
    $ sed "s/ '/,&/g" file
    12123, 'QA test case 1', 'QA environment'
    12234, 'UAT test case 1', 'UAT environment'
    
  • 2021-01-25 02:01

    Try this awk:

    awk -F" '" '{ print $1, $2, $3 }' OFS=", '" data
    

    or using a BEGIN block:

    awk -F" '" 'BEGIN {OFS="," FS} { print $1, $2, $3 }' data
    

    In either case, FS is set to " '" (a space followed by a single quote) and OFS is set to ", '" (a comma, a space, and a single quote). This assumes that " '" occurs only between fields and that all input is formatted as in the question.
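
    If the number of fields varies, the same FS/OFS idea can be applied without naming $1, $2, $3. The sketch below relies on the standard awk behaviour that assigning to a field ($1 = $1) rebuilds the record with OFS. The wrapper name comma_join_awk is made up for illustration, and the sample data is inlined.

```shell
#!/bin/bash
# Hypothetical wrapper: split each line on " '" and re-join all fields with ", '".
# Reads the files given as arguments, or stdin if there are none.
comma_join_awk() {
    # $1 = $1 forces awk to rebuild $0 using OFS, however many fields there are.
    awk -F" '" -v OFS=", '" '{ $1 = $1; print }' "$@"
}

# Demo with the sample data from the question.
comma_join_awk <<'EOF'
12123 'QA test case 1' 'QA environment'
12234 'UAT test case 1' 'UAT environment'
EOF
```

    The same caveat applies as above: a " '" sequence inside a field would be treated as a separator.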
