Question
I am working with a Unix shell script that does genome construction and then creates a phylogeny. Depending on the genome assembler you use, the final output (the phylogeny) may change. I want to compare the effects of using various genome assemblers. I have developed some metrics to compare them on, but I need help organizing them so I can run useful analyses. I would like to import my data into Excel in columns.
This is the script I am using to output data:
echo "Enter the size (Mb or Gb) of your data set:"
read SIZEOFDATASET
echo "The size of your data set is $SIZEOFDATASET"
echo "Size of Data Set:" >> metrics_file.txt
echo $SIZEOFDATASET >> metrics_file.txt
echo "Enter the name of your assembler"
read NAMEOFASSEMBLER
echo "You are using $NAMEOFASSEMBLER as your assembler"
echo "Name of Assembler:" >> metrics_file.txt
echo "$NAMEOFASSEMBLER" >> metrics_file.txt
echo "Time:" >> metrics_file.txt
The output comes out like this currently:
Size of Data Set:
387 Mb
Name of Assembler:
Velvet
Genome Size:
1745690
Time:
I want it to look something like this:
Thanks in Advance!
Answer 1:
#!/bin/sh
in_file=in.txt        # Input file
params=3              # Parameters count
res_file=$(mktemp)    # Temporary file
sep=' '               # Separator character
# Print header: parameter names are lines 0, 2, 4, ... of the first record.
# Whole lines are read, so a multi-word entry still counts as one field.
cnt=0
head -n "$((params*2))" "$in_file" | while IFS= read -r i; do
    if [ $((cnt % 2)) -eq 0 ]; then
        printf '%s\n' "$i"
    fi
    cnt=$((cnt+1))
done | paste -s -d "$sep" - >>"$res_file"
# Parse and print values
cnt=0
while IFS= read -r i; do
    # Print values, skip param names
    if [ $((cnt % 2)) -eq 1 ]; then
        printf '%s' "$i" >>"$res_file"
    fi
    if [ $(((cnt+1) % (params*2))) -eq 0 ]; then
        # Values line is finished, print newline
        echo >>"$res_file"
    elif [ $((cnt % 2)) -eq 1 ]; then
        # More values expected to be printed on this line
        printf '%s' "$sep" >>"$res_file"
    fi
    cnt=$((cnt+1))
done <"$in_file"
# Make nice table format (column -t splits on whitespace)
column -t "$res_file"
rm -f "$res_file"
Explanation
This script assumes that:
- the input file is called "in.txt" (see the in_file variable)
- the input file uses the format you described in the question
- the result table should have 3 columns (see the params variable)
Most of the code just parses your input data format. The actual column formatting is done by the column tool.
If you want to export this table to Excel, just change the sep variable to ',' and save the output to a .csv file. This file can then be imported into Excel.
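As a rough sketch of the CSV route, the same name/value layout can also be converted with a single awk program instead of the shell loops. This is only an illustration under some assumptions: the function name to_csv is made up for this example, every record has the same number of name/value pairs, and no value contains a comma. A trailing colon on a name line (as in "Time:") is stripped.

```shell
#!/bin/sh
# Sketch: turn alternating "name" / "value" lines into CSV rows.
# Assumptions (not from the answer above): the function name to_csv,
# a fixed number of name/value pairs per record, no commas in values.
to_csv() {
    # $1 = number of parameters per record; input is read from stdin
    awk -v n="$1" '
        NR % 2 == 1 {                       # odd lines hold parameter names
            if (NR < 2 * n) {               # take the header from the first record only
                sub(/:$/, "")               # drop a trailing colon such as "Time:"
                header = header (hcol++ ? "," : "") $0
            }
            next
        }
        {                                   # even lines hold values
            row = row (col++ ? "," : "") $0
            if (col == n) {                 # record complete
                if (!printed++) print header
                print row
                row = ""; col = 0
            }
        }
    '
}

# Demo with two records in the layout described above
printf '%s\n' Size 387 Name Velvet Time 13 Size 31415 Name Minia Time 18 | to_csv 3
```

With a real input file, something like `to_csv 3 < in.txt > metrics.csv` would produce a file that Excel can open directly (metrics.csv is just a placeholder name).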
Example
Input file:
Size
387
Name
Velvet
Time
13
Size
31415
Name
Minia
Time
18
Size
31337
Name
ABCDEF
Time
42
Script output:
Size Name Time
387 Velvet 13
31415 Minia 18
31337 ABCDEF 42
Answer 2:
Sam's answer provides exactly what you are looking for, but you could also streamline things: instead of converting the metrics file into a table afterwards, write the table right away. For example, write a single script like this, user_input.bash:
#!/bin/bash
# Prompts go to stderr so that stdout carries only the table
echo "Enter the size (Mb or Gb) of your data set:" >&2
read -r SIZEOFDATASET
echo "The size of your data set is $SIZEOFDATASET" >&2
echo "Enter the name of your assembler" >&2
read -r NAMEOFASSEMBLER
echo "You are using $NAMEOFASSEMBLER as your assembler" >&2
echo "Enter Time:" >&2
read -r TIME
echo "You entered Time: $TIME" >&2
echo "Name Size Time"
echo "$NAMEOFASSEMBLER $SIZEOFDATASET $TIME"
To use the program:
./user_input.bash > metrics.file.1.txt
./user_input.bash > metrics.file.2.txt
./user_input.bash > metrics.file.3.txt
...
Then collect all results (note the >> on the second command, so the header line is not overwritten):
head -n 1 metrics.file.1.txt > allmetrics.txt
tail -n +2 -q metrics.file.*.txt >> allmetrics.txt
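If merging separate files afterwards feels clumsy, another option is to append every run to one shared table and write the header only when the file is still empty. The following is a minimal sketch of that idea, not part of the script above: the function name append_row and the tab separator are assumptions made for illustration (tabs keep a value such as "387 Mb" in one column).

```shell
#!/bin/sh
# Sketch: append one result row to a shared table, adding the header only once.
# Assumptions: the function name append_row and the tab separator are
# illustrative choices, not part of the original script.
append_row() {
    table=$1; name=$2; size=$3; run_time=$4
    # -s is true only if the file exists and is non-empty
    if [ ! -s "$table" ]; then
        printf 'Name\tSize\tTime\n' >"$table"
    fi
    printf '%s\t%s\t%s\n' "$name" "$size" "$run_time" >>"$table"
}

# Demo using a temporary file
demo=$(mktemp)
append_row "$demo" Velvet '387 Mb' 13
append_row "$demo" Minia 31415 18
cat "$demo"
rm -f "$demo"
```

In the interactive script, the two final echo lines could then be replaced by a single call such as `append_row allmetrics.txt "$NAMEOFASSEMBLER" "$SIZEOFDATASET" "$TIME"`.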
HTH
Source: https://stackoverflow.com/questions/28727449/organizing-the-output-of-my-shell-script-into-tables-within-the-text-file