Organizing the output of my shell script into tables within the text file

Question

I am working with a Unix shell script that does genome construction and then creates a phylogeny. Depending on the genome assembler you use, the final output (the phylogeny) may change. I want to compare the effects of using various genome assemblers. I have developed some metrics to compare them on, but I need help organizing them so I can run useful analyses. I would like to import my data into Excel in columns.

This is the script I am using to output data:

echo "Enter the size (Mb or Gb) of your data set:"
read SIZEOFDATASET
echo "The size of your data set is $SIZEOFDATASET"
echo "Size of Data Set:" >> metrics_file.txt
echo "$SIZEOFDATASET" >> metrics_file.txt

echo "Enter the name of your assembler"
read NAMEOFASSEMBLER
echo "You are using $NAMEOFASSEMBLER as your assembler"
echo "Name of Assembler:" >> metrics_file.txt 
echo "$NAMEOFASSEMBLER" >> metrics_file.txt
echo "Time:" >> metrics_file.txt

The output comes out like this currently:

Size of Data Set:
387 Mb
Name of Assembler:
Velvet
Genome Size:
1745690
Time:

I want it to look something like this:

Thanks in Advance!

Answer 1:

#!/bin/sh

in_file=in.txt      # Input file
params=3            # Parameters count
res_file=$(mktemp)  # Temporary file
sep=' '             # Separator character

# Print header
cnt=0
for i in $(head -n $((params*2)) "$in_file"); do
    # Even positions hold parameter names, odd positions hold values
    if [ $((cnt % 2)) -eq 0 ]; then
        echo "$i"
    fi
    cnt=$((cnt+1))
done | sed ":a;N;\$!ba;s/\n/$sep/g" >>"$res_file"

# Parse and print values
cnt=0
for i in $(cat "$in_file"); do
    # Print values, skip param names
    # (printf is used because "echo -n" is not portable under /bin/sh)
    if [ $((cnt % 2)) -eq 1 ]; then
        printf '%s' "$i" >>"$res_file"
    fi

    if [ $(((cnt+1) % (params*2))) -eq 0 ]; then
        # Values line is finished, print newline
        echo >>"$res_file"
    elif [ $((cnt % 2)) -eq 1 ]; then
        # More values expected to be printed on this line
        printf '%s' "$sep" >>"$res_file"
    fi

    cnt=$((cnt+1))
done

# Make nice table format
column -t "$res_file"
rm -f "$res_file"

Explanation

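This script assumes that:

the input file is called "in.txt" (see the in_file variable)
the input file uses the format you described in the question, with every name and value being a single word (the loops split on whitespace, so a value like "387 Mb" would be read as two items)
the result table should have 3 columns (see the params variable)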

Most of the code just parses your input data format. The actual column formatting is done by the column tool.

If you want to export this table to Excel, change the sep variable to ',' and redirect the script's output to a .csv file. Excel can import that file directly.
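If the values can contain spaces (the question's own sample has "387 Mb"), the word-splitting loops above will misalign the columns. Here is a rough awk sketch that works line by line instead. It is not the script from this answer, just an alternative under these assumptions: names and values alternate one per line, every record has the same number of fields in the same order, and the helper name pairs_to_csv is made up for illustration.

```shell
#!/bin/sh
# pairs_to_csv (hypothetical helper): read alternating name/value lines on
# stdin and write a CSV header plus one row per record to stdout.
# $1 = number of fields per record (assumed default: 3).
pairs_to_csv() {
    awk -v params="${1:-3}" '
        # Wrap a field in double quotes, doubling any embedded quotes
        function quote(s) { gsub(/"/, "\"\"", s); return "\"" s "\"" }
        {
            idx = int((NR - 1) / 2) % params      # 0-based field position
            rec = int((NR - 1) / (2 * params))    # 0-based record number
        }
        NR % 2 == 1 {                             # odd lines hold field names
            if (rec == 0) {                       # header comes from record 0
                name = $0
                sub(/:[ \t]*$/, "", name)         # drop a trailing colon
                header = header (idx ? "," : "") quote(name)
            }
            next
        }
        {                                         # even lines hold values
            row = row (idx ? "," : "") quote($0)
            if (idx == params - 1) {              # record is complete
                if (rec == 0) print header
                print row
                row = ""
            }
        }
    '
}
```

Something like `pairs_to_csv 3 < metrics_file.txt > metrics.csv` should then give a file Excel can open. Every field is quoted so that spaces or commas inside a value stay in one cell.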

Example

Input file:

Size
387
Name
Velvet
Time
13
Size
31415
Name
Minia
Time
18
Size
31337
Name
ABCDEF
Time
42

Script output:

Size   Name    Time
387    Velvet  13
31415  Minia   18
31337  ABCDEF  42

Answer 2:

Sam's answer provides exactly what you asked for, but you could also streamline things: instead of converting the metrics file into a table afterwards, write the table directly. For example, write a single script like this, user_input.bash:

#!/bin/bash
# Prompts go to stderr so that only the table reaches stdout
echo "Enter the size (Mb or Gb) of your data set:" >&2
read SIZEOFDATASET
echo "The size of your data set is $SIZEOFDATASET" >&2
echo "Enter the name of your assembler" >&2
read NAMEOFASSEMBLER
echo "You are using $NAMEOFASSEMBLER as your assembler" >&2
echo "Enter Time:" >&2
read TIME
echo "You entered Time: $TIME" >&2
echo "Name Size Time"
echo "$NAMEOFASSEMBLER $SIZEOFDATASET $TIME"

To use the program:

./user_input.bash > metrics.file.1.txt
./user_input.bash > metrics.file.2.txt
./user_input.bash > metrics.file.3.txt
...

Then collect all results (note the >> on the second line, so the header written by the first line is not overwritten):

head -n 1  metrics.file.1.txt > allmetrics.txt
tail -n +2 -q metrics.file.*.txt >> allmetrics.txt
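If you would rather skip the per-run files and the merge step, one option is to append each run to a single CSV file and write the header only once. This is just a sketch, assuming the values contain no commas; the helper name append_row and the column order are made up for illustration.

```shell
#!/bin/sh
# append_row (hypothetical helper): append one record to a CSV file,
# writing the header line only when the file is missing or empty.
# Usage: append_row FILE NAME SIZE TIME
append_row() {
    out_file=$1
    name=$2
    size=$3
    time=$4
    # -s is true for a file that exists and is non-empty
    if [ ! -s "$out_file" ]; then
        printf '%s\n' 'Name,Size,Time' >>"$out_file"
    fi
    printf '%s,%s,%s\n' "$name" "$size" "$time" >>"$out_file"
}
```

For example, calling `append_row allmetrics.csv "$NAMEOFASSEMBLER" "$SIZEOFDATASET" "$TIME"` at the end of each run would build the table one row at a time.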

HTH

Source: https://stackoverflow.com/questions/28727449/organizing-the-output-of-my-shell-script-into-tables-within-the-text-file

Tags

shell

unix

text-files

bioinformatics

genome