How do I place commas between fields?
Input data
12123 \'QA test case 1\' \'QA environment\'
12234 \'UAT test case 1\' \'UAT environment\'
Your input data looks very much like an argument list. One convenient approach is therefore to define a bash function that returns its argument list as comma-separated tokens, and to invoke it for each line in your file.
However, the simple implementation below loses the quotes around the multi-word tokens (it does place the commas properly). Keeping those quotes exactly as they were would be a bit more complicated, although outputting every token quoted is very easy:
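#!/bin/bash
# Print all arguments on one line, joined by ', '.
function csv_args() {
    # Test $# rather than "$1" so an empty argument does not end the loop early.
    while [ "$#" -gt 0 ]; do
        printf '%s' "$1"
        shift
        # More arguments left: emit the separator.
        [ "$#" -gt 0 ] && printf ', '
    done
    echo
}
# eval re-parses each line as shell words, which is what strips the quotes.
# It also executes anything else on the line, so only use it on trusted input.
while read line; do
    eval "csv_args $line"
done < /path/to/your/file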
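The "every token quoted" variant mentioned above could look roughly like this. It is only a sketch: the function name csv_args_quoted is made up for illustration, the sample lines assume plain single quotes around the fields, and a field that itself contains a single quote would come out wrong.

```shell
#!/bin/bash
# Hypothetical variant of csv_args: joins its arguments with ', '
# and wraps every one of them in single quotes.
csv_args_quoted() {
    local sep='' arg
    for arg in "$@"; do
        printf "%s'%s'" "$sep" "$arg"
        sep=', '
    done
    printf '\n'
}

# eval re-parses each line as shell words, so only feed it trusted input.
# Sample data (assumed format) is supplied inline instead of from a file.
while read -r line; do
    eval "csv_args_quoted $line"
done <<'EOF'
12123 'QA test case 1' 'QA environment'
12234 'UAT test case 1' 'UAT environment'
EOF
```

With the two sample lines this prints '12123', 'QA test case 1', 'QA environment' and the corresponding line for 12234.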
A naïve bash implementation that assumes that no (escaped) ' instances ever appear inside a field.
Input is assumed to come from a file named file:
# Read all whitespace-separated tokens (potentially across quoted field boundaries).
while read -ra tkns; do
    # Initialize per-line variables.
    numTkns=${#tkns[@]} i=0 inField=0
    # Loop over all tokens.
    for tkn in "${tkns[@]}"; do
        # Determine if we're inside a quoted field.
        [[ $tkn == \'* ]] && inField=1
        [[ $tkn == *\' ]] && inField=0
        # Determine the output separator:
        if (( ++i == numTkns )); then
            sep=$'\n' # last token, terminate output line with \n
        else
            # inside a field: use just a space; between fields: use ', '
            (( inField )) && sep=' ' || sep=', '
        fi
        # Output token and separator.
        printf '%s%s' "$tkn" "$sep"
    done
done < file
The solution is to walk through every character in the record (while read -n 1), joining consecutive non-space characters into one element value and treating everything enclosed in single quotes (or, in your case, escaped single quotes: \') as a single element. Every time you complete a unit (on reaching a space or a newline), append it to an array. On encountering a newline or reaching EOF, end the record and print the record array in your format. The cycle then begins again with the record array cleared.
Want source code? Show me your work first. :)
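For reference, the character walk described above might be sketched as follows. This is one possible reading of the description, not the answerer's own code: the function names print_record and split_records are invented for illustration, fields are assumed to be separated by spaces, and only plain single quotes are handled (escaped \' quotes are not).

```shell
#!/bin/bash
# Sketch only: assumes space-separated fields and single quotes that always
# come in pairs around a field (no escaped quotes such as \').

# Hypothetical helper: print the collected fields joined by ', '.
print_record() {
    local sep='' f
    for f in "$@"; do
        printf '%s%s' "$sep" "$f"
        sep=', '
    done
    printf '\n'
}

# Hypothetical helper: read records from stdin one character at a time.
split_records() {
    local char field='' in_quotes=0
    local -a record=()
    # IFS= keeps spaces, -r keeps backslashes, -n 1 reads a single character.
    # At a newline, read succeeds with an empty $char; at EOF it fails.
    while IFS= read -r -n 1 char; do
        if [[ -z $char ]]; then
            # End of record: flush the pending field and print the array.
            if [[ -n $field ]]; then record+=("$field"); fi
            if (( ${#record[@]} > 0 )); then print_record "${record[@]}"; fi
            record=()
            field=''
            in_quotes=0
        elif [[ $char == "'" ]]; then
            # Keep the quote in the output and toggle the in-quotes state.
            field+=$char
            in_quotes=$(( 1 - in_quotes ))
        elif [[ $char == ' ' ]] && (( in_quotes == 0 )); then
            # Unquoted space: the current unit is complete.
            if [[ -n $field ]]; then record+=("$field"); fi
            field=''
        else
            field+=$char
        fi
    done
    # EOF without a trailing newline: flush whatever is left.
    if [[ -n $field ]]; then record+=("$field"); fi
    if (( ${#record[@]} > 0 )); then print_record "${record[@]}"; fi
    return 0
}

# Sample data (assumed format) supplied inline.
split_records <<'EOF'
12123 'QA test case 1' 'QA environment'
12234 'UAT test case 1' 'UAT environment'
EOF
```

Because the quote characters are appended to the field as they are read, the quotes survive in the output. Blank lines produce no output in this sketch.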
Another option is to use a CSV parser:
ruby -rcsv -ne '
  # Keyword arguments are used because newer Ruby versions reject a positional options hash.
  puts CSV.generate_line(
    CSV.parse_line($_.strip, col_sep: " ", quote_char: "'\''"),
    force_quotes: true
  )
' file
"12123","QA test case 1","QA environment"
"12234","UAT test case 1","UAT environment"
$ sed "s/ '/,&/g" file
12123, 'QA test case 1', 'QA environment'
12234, 'UAT test case 1', 'UAT environment'
Try this awk:
awk -F" '" '{ print $1, $2, $3 }' OFS=", '" data
or using a BEGIN block:
awk -F" '" 'BEGIN {OFS="," FS} { print $1, $2, $3 }' data
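In either case, FS is set to " '" (a space followed by a single quote) and OFS is set to ", '" (a comma followed by the same space and single quote; in the BEGIN form this is written as "," concatenated with FS). It's based on the assumption that " '" is a valid, unique field separator and that all input data is formatted and arranged as in the question.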
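If the number of quoted fields varies from line to line, the same separators can be used without naming $1, $2, $3 explicitly. The following is a sketch under the same assumption about " '" being the separator. The wrapper function add_commas and the inline sample data are made up for illustration.

```shell
#!/bin/bash
# Hypothetical wrapper: reads records from stdin and rewrites the separators.
add_commas() {
    # Assigning to any field ($1 = $1) makes awk rebuild the record,
    # joining all fields with OFS, however many there are.
    awk -F" '" -v OFS=", '" '{ $1 = $1; print }'
}

# Sample data (assumed format); the second line has an extra field.
add_commas <<'EOF'
12123 'QA test case 1' 'QA environment'
12234 'UAT test case 1' 'UAT environment' 'extra field'
EOF
```

The $1 = $1 assignment changes nothing in the field itself; it only forces awk to reassemble $0 with the output separator.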