Convert text file into a comma delimited string

南旧 2021-02-06 17:25

I can't seem to find an SO question that matches this exact problem.

I have a text file that has one text token per line, without any commas, tabs, or quotes. I want

5 answers
  • 2021-02-06 17:53

    One way with Awk is to set RS to the empty string, so that records are separated by blank lines rather than newlines. With FS set to a newline, each line becomes one field, so tokens that contain spaces stay intact and the output is comma-separated as expected.

    awk '{$1=$1}1' FS='\n' OFS=',' RS= file
    

    The {$1=$1} assignment forces Awk to rebuild the record ($0) from its fields, which applies any changes made to the field separators (FS/OFS) and record separators (RS/ORS). The trailing 1 is an always-true pattern that prints each record after the modifications made inside {..}.
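
    As a minimal sketch of how this behaves, the snippet below runs the same Awk command on a small sample file. The temporary file and the variable names (tmp, csv_string, two_records) are assumptions made for illustration only. It also shows a caveat of this mode: a blank line inside the file starts a new record, so the output then has more than one line.

```shell
# Sample input: one token per line, one of them containing a space.
tmp=$(mktemp)
printf '%s\n' one 'two words' three > "$tmp"

# RS= switches Awk to paragraph mode (records end at blank lines),
# FS='\n' makes each line one field, and $1=$1 rebuilds $0 with OFS.
csv_string=$(awk '{$1=$1}1' FS='\n' OFS=',' RS= "$tmp")
printf '%s\n' "$csv_string"    # one,two words,three

# Caveat: a blank line splits the input into two records,
# so two comma-joined lines are printed instead of one.
printf 'a\nb\n\nc\nd\n' > "$tmp"
two_records=$(awk '{$1=$1}1' FS='\n' OFS=',' RS= "$tmp")
printf '%s\n' "$two_records"

rm -f "$tmp"
```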

  • 2021-02-06 18:02

    With a Perl one-liner:

    $ cat csv_2_text
    one
    two
    three
    $ perl -ne '{ chomp; push(@lines,$_) } END { $x=join(",",@lines);  print "$x" }' csv_2_text
    one,two,three
    
    $ perl -ne ' { chomp; $_="$_," if not eof ;printf("%s",$_) } ' csv_2_text
    one,two,three
    $
    

    A variant from @codeforester:

    $ perl -ne 'BEGIN { my $delim = "" } { chomp; printf("%s%s", $delim, $_); $delim="," } END { printf("\n") }' csv_2_text
    one,two,three
    $
    
  • 2021-02-06 18:13

    The usual command for this is paste:

    csv_string=$(paste -sd, file.txt)
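
    For readers unfamiliar with the flags, here is a small sketch on sample data: -s joins all lines of a file serially into one line, and -d, sets the delimiter to a comma. The temporary file and the variable names are assumptions for illustration. The second command reads from standard input and passes - explicitly, which some paste implementations require.

```shell
# Sample input: one token per line.
tmp=$(mktemp)
printf '%s\n' one two three > "$tmp"

# -s: join the lines of the file into a single line; -d,: use a comma.
csv_string=$(paste -sd, "$tmp")
printf '%s\n' "$csv_string"    # one,two,three

# Reading from a pipe: name standard input explicitly with "-".
from_stdin=$(printf '%s\n' a b c | paste -sd, -)
printf '%s\n' "$from_stdin"    # a,b,c

rm -f "$tmp"
```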
    
  • 2021-02-06 18:13

    You can do it entirely with bash parameter expansion operators instead of using tr and sed.

    csv_string=$(<file)               # read file into variable
    csv_string=${csv_string//$'\n'/,} # replace \n with ,
    csv_string=${csv_string%,}        # remove trailing comma
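
    A short sketch of these expansions on sample data follows (Bash is required; the temporary file and variable names are assumptions for illustration). One detail worth noting: $(<file) is a command substitution, so it already strips trailing newlines, and the ${csv_string%,} step should therefore be a harmless safeguard rather than a necessity.

```shell
#!/bin/bash
# Sample input: one token per line.
tmp=$(mktemp)
printf '%s\n' one two three > "$tmp"

csv_string=$(<"$tmp")                 # read file; trailing newlines are stripped
csv_string=${csv_string//$'\n'/,}     # replace every remaining newline with a comma
printf '%s\n' "$csv_string"           # one,two,three

# Even with extra blank lines at the end of the file, no trailing comma
# appears, because the command substitution drops all trailing newlines.
printf 'one\ntwo\n\n\n' > "$tmp"
stripped=$(<"$tmp")
stripped=${stripped//$'\n'/,}
printf '%s\n' "$stripped"             # one,two

rm -f "$tmp"
```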
    
  • 2021-02-06 18:14

    I tested the four approaches (Bash only, paste, Awk, Perl) on a Linux box, along with the tr | sed approach shown in the question:

    #!/bin/bash
    
    # generate test data
    seq 1 10000 > test.file
    
    times=${1:-50}
    
    printf '%s\n' "Testing paste solution"
    time {
        for ((i=0; i < times; i++)); do
          csv_string=$(paste -sd, test.file)
        done
    }
    
    printf -- '----\n%s\n' "Testing pure Bash solution"
    time {
        for ((i=0; i < times; i++)); do
          csv_string=$(<test.file)          # read file into variable
          csv_string=${csv_string//$'\n'/,} # replace \n with ,
          csv_string=${csv_string%,}        # remove trailing comma
        done
    }
    
    printf -- '----\n%s\n' "Testing Awk solution"
    time {
        for ((i=0; i < times; i++)); do
          csv_string=$(awk '{$1=$1}1' FS='\n' OFS=',' RS= test.file)
        done
    }
    
    printf -- '----\n%s\n' "Testing Perl solution"
    time {
        for ((i=0; i < times; i++)); do
          csv_string=$(perl -ne '{ chomp; $_="$_," if not eof; printf("%s",$_) }' test.file)
        done
    }
    
    printf -- '----\n%s\n' "Testing tr | sed solution"
    time {
        for ((i=0; i < times; i++)); do
          csv_string=$(tr '\n' ',' < test.file | sed 's/,$//')
        done
    }
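
    Before comparing timings, it can help to confirm that the approaches actually produce the same string. The following is a hypothetical sanity check, not part of the benchmark above; the smaller input size and the names tmp, via_paste, via_awk, via_tr_sed, via_bash, and mismatches are assumptions for illustration. The Perl variant could be added to the loop in the same way.

```shell
#!/bin/bash
# Small data set so the slow pure-Bash variant finishes quickly.
tmp=$(mktemp)
seq 1 100 > "$tmp"

via_paste=$(paste -sd, "$tmp")
via_awk=$(awk '{$1=$1}1' FS='\n' OFS=',' RS= "$tmp")
via_tr_sed=$(tr '\n' ',' < "$tmp" | sed 's/,$//')
via_bash=$(<"$tmp")
via_bash=${via_bash//$'\n'/,}

# Count how many results differ from the paste output.
mismatches=0
for result in "$via_awk" "$via_tr_sed" "$via_bash"; do
    [ "$result" = "$via_paste" ] || mismatches=$((mismatches + 1))
done
printf 'mismatches: %d\n' "$mismatches"

rm -f "$tmp"
```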
    

    Surprisingly, the Bash-only solution does quite poorly. paste comes out on top, followed by tr | sed, Awk, and Perl:

    Testing paste solution
    
    real    0m0.109s
    user    0m0.052s
    sys     0m0.075s
    ----
    Testing pure Bash solution
    
    real    1m57.777s
    user    1m57.113s
    sys     0m0.341s
    ----
    Testing Awk solution
    
    real    0m0.221s
    user    0m0.152s
    sys     0m0.077s
    ----
    Testing Perl solution
    
    real    0m0.424s
    user    0m0.388s
    sys     0m0.080s
    ----
    Testing tr | sed solution
    
    real    0m0.162s
    user    0m0.092s
    sys     0m0.141s
    

    For some reason, csv_string=${csv_string//$'\n'/,} hangs on macOS Mojave running Bash 4.4.23.


    Related posts:

    • How to join multiple lines of file names into one with custom delimiter?
    • Concise and portable “join” on the Unix command-line
    • Turning multi-line string into single comma-separated