Convert text file into a comma delimited string

Submitted by ぐ巨炮叔叔 on 2019-12-05 02:35:51

Question


I can't seem to find an SO question that matches this exact problem.

I have a text file that has one text token per line, without any commas, tabs, or quotes. I want to create a comma delimited string based on the file content.

Input:

one
two
three

Output:

one,two,three

I am using this command:

csv_string=$(tr '\n' ',' < file | sed 's/,$//')

Is there a more efficient way to do this?


Answer 1:


The usual command to do this is paste:

csv_string=$(paste -sd, file.txt)
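As a minimal sketch of what the options do: -s joins all lines of the input serially into a single line, and -d, sets the delimiter to a comma. The temporary file and the variable csv_from_stdin below are made up for illustration; the - operand tells paste to read standard input, which some implementations require explicitly.

```shell
#!/bin/bash
# Hypothetical sample data written to a temporary file
tmp=$(mktemp)
printf '%s\n' one two three > "$tmp"

# -s: join all lines of the file into one line; -d,: use a comma as the delimiter
csv_string=$(paste -sd, "$tmp")
printf '%s\n' "$csv_string"    # one,two,three

# The same join, reading from standard input via the "-" operand
csv_from_stdin=$(printf '%s\n' one two three | paste -sd, -)
printf '%s\n' "$csv_from_stdin"

rm -f "$tmp"
```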



Answer 2:


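One way with Awk is to set RS to the empty string, which makes Awk treat records as separated by blank lines, so a file without blank lines is read as a single record. With FS set to a newline, each line becomes one field, so words containing spaces stay intact, and the fields are joined with commas on output.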

awk '{$1=$1}1' FS='\n' OFS=',' RS= file

The {$1=$1} assignment forces Awk to rebuild each record ($0) so that changes to the field separators (FS/OFS) and/or record separators (RS/ORS) take effect. The trailing 1 is an always-true pattern that prints every record with the modifications done inside {..}.
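The sketch below illustrates both points on made-up data: a token containing a space stays in one field, and a blank line in the input starts a new record, which yields a second output line. The temporary file and the variable awk_out are assumptions for the example only.

```shell
#!/bin/bash
# Hypothetical input: three lines, a blank line, then two more lines
tmp=$(mktemp)
printf '%s\n' one 'two words' three '' four five > "$tmp"

# RS= (empty) splits records on blank lines; FS='\n' makes each line one field
awk_out=$(awk '{$1=$1}1' FS='\n' OFS=',' RS= "$tmp")
printf '%s\n' "$awk_out"
# one,two words,three
# four,five

rm -f "$tmp"
```

If the file may contain blank lines and a single output line is required, this command is therefore not a drop-in replacement for paste.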




Answer 3:


You can do it entirely with bash parameter expansion operators instead of using tr and sed.

csv_string=$(<file)               # read file into variable
csv_string=${csv_string//$'\n'/,} # replace \n with ,
csv_string=${csv_string%,}        # remove trailing comma
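A related Bash-only alternative, which is not part of the answer above, is to read the lines into an array and join them through IFS. This is a sketch assuming Bash 4 or later (for mapfile); the function name join_lines and the sample file are hypothetical.

```shell
#!/bin/bash
# join_lines FILE: print the lines of FILE joined by commas (hypothetical helper)
join_lines() {
    local -a lines
    local IFS=,                  # "${lines[*]}" joins on the first character of IFS
    mapfile -t lines < "$1"      # -t strips the trailing newline from each line
    printf '%s\n' "${lines[*]}"
}

tmp=$(mktemp)
printf '%s\n' one two three > "$tmp"
csv_string=$(join_lines "$tmp")
printf '%s\n' "$csv_string"      # one,two,three
rm -f "$tmp"
```

This variant skips the search-and-replace over the whole string; whether it is faster in practice was not measured here.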



Answer 4:


With a Perl one-liner:

$ cat csv_2_text
one
two
three
$ perl -ne '{ chomp; push(@lines,$_) } END { $x=join(",",@lines);  print "$x" }' csv_2_text
one,two,three

$ perl -ne ' { chomp; $_="$_," if not eof ;printf("%s",$_) } ' csv_2_text
one,two,three
$

From @codeforester

$ perl -ne 'BEGIN { my $delim = "" } { chomp; printf("%s%s", $delim, $_); $delim="," } END { printf("\n") }' csv_2_text
one,two,three
$



Answer 5:


I tested the four approaches on a Linux box (Bash only, paste, Awk, Perl), as well as the tr | sed approach shown in the question:

#!/bin/bash

# generate test data
seq 1 10000 > test.file

times=${1:-50}

printf '%s\n' "Testing paste solution"
time {
    for ((i=0; i < times; i++)); do
      csv_string=$(paste -sd, test.file)
    done
}

printf -- '----\n%s\n' "Testing pure Bash solution"
time {
    for ((i=0; i < times; i++)); do
      csv_string=$(<test.file)          # read file into variable
      csv_string=${csv_string//$'\n'/,} # replace \n with ,
      csv_string=${csv_string%,}        # remove trailing comma
    done
}

printf -- '----\n%s\n' "Testing Awk solution"
time {
    for ((i=0; i < times; i++)); do
      csv_string=$(awk '{$1=$1}1' FS='\n' OFS=',' RS= test.file)
    done
}

printf -- '----\n%s\n' "Testing Perl solution"
time {
    for ((i=0; i < times; i++)); do
      csv_string=$(perl -ne '{ chomp; $_="$_," if not eof; printf("%s",$_) }' test.file)
    done
}

printf -- '----\n%s\n' "Testing tr | sed solution"
time {
    for ((i=0; i < times; i++)); do
      csv_string=$(tr '\n' ',' < test.file | sed 's/,$//')
    done
}

Surprisingly, the Bash-only solution does quite poorly. paste comes out on top, followed by tr | sed, Awk, and Perl:

Testing paste solution

real    0m0.109s
user    0m0.052s
sys     0m0.075s
----
Testing pure Bash solution

real    1m57.777s
user    1m57.113s
sys     0m0.341s
----
Testing Awk solution

real    0m0.221s
user    0m0.152s
sys     0m0.077s
----
Testing Perl solution

real    0m0.424s
user    0m0.388s
sys     0m0.080s
----
Testing tr | sed solution

real    0m0.162s
user    0m0.092s
sys     0m0.141s

For some reason, csv_string=${csv_string//$'\n'/,} hangs on macOS Mojave running Bash 4.4.23.
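Before comparing timings, it can help to confirm that the approaches actually produce the same string. The following is a minimal sketch on a small made-up input; the variable names are illustrative, and the Perl variant is left out for brevity.

```shell
#!/bin/bash
# Small hypothetical input; the benchmark uses 10000 lines instead
tmp=$(mktemp)
seq 1 5 > "$tmp"

via_paste=$(paste -sd, "$tmp")
via_tr_sed=$(tr '\n' ',' < "$tmp" | sed 's/,$//')
via_awk=$(awk '{$1=$1}1' FS='\n' OFS=',' RS= "$tmp")
via_bash=$(<"$tmp")
via_bash=${via_bash//$'\n'/,}
via_bash=${via_bash%,}

# Compare every result against the paste output
all_match=yes
for result in "$via_tr_sed" "$via_awk" "$via_bash"; do
    [[ $result == "$via_paste" ]] || all_match=no
done

printf 'paste output: %s\nall approaches agree: %s\n' "$via_paste" "$all_match"
rm -f "$tmp"
```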


Related posts:

  • How to join multiple lines of file names into one with custom delimiter?
  • Concise and portable “join” on the Unix command-line
  • Turning multi-line string into single comma-separated


Source: https://stackoverflow.com/questions/53093449/convert-text-file-into-a-comma-delimited-string
