Read an input number and redirect it to a column in a file

Question

I have a file with a table like this:

My program already returns the max, min and mean values from the line of the gene the user is looking for. Now my goal is the same, but instead of typing a gene name, the user will enter a column number, and I want to obtain these values from that one column only.

Here is my code:

#!/bin/bash

FICHERO="affy.txt"

function OPTIONS
{
   echo "_____________OPTIONS_____________"
   echo ""
   echo "   1. Select one gene and its test results"
   echo "  2. Select one column and its test results"
   echo "               3. Exit"
}

function gene
{
   if [ -e "affy.txt" ]; then  # If the file exists...
      echo "Print the name of the gene you are looking for: "
      read -p "Name:" NAME
      OLDIFS=$IFS
      IFS=","; 
      # Compute max, min and mean.
      min=` grep -m1 "$NAME" affy.txt | tr -s '  ' ' ' |tr -s ',' '.' | cut -d ' ' -f3- | tr -s ' ' '\n' | sort -n | head -1`
      max=` grep -m1 "$NAME" affy.txt  | tr -s '  ' ' ' |tr -s ',' '.' | cut -d ' ' -f3- | tr -s ' ' '\n' | sort -n | tail -1`
      mean=` grep -m1 "$NAME" affy.txt | tr -s '  ' ' ' |tr -s ',' '.' | cut -d ' ' -f3- | tr -s ' ' '\n' | awk '{sum+=$1} END {print sum/NR}'`

      echo "Min value: "$min
      echo "Max value: "$max
      echo "Mean value: "$mean


   else
      echo "Invalid gene name!"
   fi

   echo
}

function column
{   
   if [ -e $FICHERO ]; then
      echo "Print the column number you are looking for: "
      read -p "Name: " NAME


   else
      echo "The file does not exist or contains no entries"
   fi
}

opc=0
exit=3   # Matches menu entry "3. Exit"

while [ $opc -ne $exit ];
do   
   clear
   OPTIONS  # Draw the menu on screen
   read -p "Option:..." opc  # Read the chosen option

   if [ $opc -ge 1 ] && [ $opc -le $exit ]; then
      clear
      case $opc in   # Actions for the different menu options

         1)gene   
            ;;

         2)column
            ;;
      esac
  else
  echo "Insert a correct option!!"

  fi
  echo "Press any key..."
  read
  done

Option number 1 is working.

I tried something like this in the function named column, but it doesn't work:

function column
{   
   if [ -s $FICHERO ]; then
      echo "Print the column number you are looking for: "
      read -p "column: " column
      for i in "$column"
      do
         col+="${i#-}"","
         echo "You are working with column number:" $col
      done

   else
      echo "The file does not exist or contains no entries"
   fi

   if [ "$col" = "" ]; then
          echo "Insert Columns please!"
   else
      for i in $col; 
      do 
      echo
      echo minim columna= `tr -s ',' '.' affy.txt | tr -s ' ' '\n' | cut -d' ' -f"$col" |  sort -n | head -1`
      echo maxim columna "$i"= `grep "$col" affy.txt | tr -s '  ' ' ' |tr -s ',' '.' | cut -d ' ' -f"$i" | sort -n | tail -1`
      echo average columna "$i"= `grep "$col" affy.txt | tr -s '  ' ' ' |tr -s ',' '.' | cut -d ' ' -f"$i" | awk '{sum+=$0} END {print sum/NR}'`

      shift
      done
   fi
}

Answer 1:

Awk is a good tool for such column manipulation exercises; the following block shows how to get all the information on the column COL using awk:

  awk 'BEGIN{sum=0}                    # Set initial value
  { if (NR <= 1) { next }              # Skip first line which is the column name
    if (NR == 2) { min = max = $COL }  # Seed min and max from the first data row
    if ($COL < min) { min = $COL }     # Store minimum so far
    if ($COL > max) { max = $COL }     # Store maximum so far
    sum += $COL }                      # Store sum of the column
    END { print "minim columna=" min
          print "maxim columna=" max
          print "average columna=" sum/(NR-1) }' file.txt

Note that because we skip the header line, we calculate the average using sum/(NR-1) not sum/NR.

With your program it is important to be able to get the value of COL from the bash script. This can be done by using awk's -v parameter:

awk -v "COL=$col" 'BEGIN{ ...
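
As a minimal illustration of the -v mechanism on its own, the sketch below copies a shell value into an awk variable and prints that field for every non-header line. The helper name print_field and the two-row sample file are assumptions made up for this example; they are not part of the original script.

```shell
#!/bin/bash

# Hypothetical helper: print field number $2 of every non-header line in file $1.
print_field() {
    local file="$1" col="$2"
    # -v assigns the shell value to the awk variable COL before the program runs,
    # so $COL inside awk means "the field whose number is stored in COL".
    awk -v "COL=$col" 'NR > 1 { print $COL }' "$file"
}

# Assumed sample data: one header line, then one line per gene.
sample=$(mktemp)
printf '%s\n' 'GENE RESULT1 RESULT2 RESULT3' \
              'GENE1 1 6 9' \
              'GENE2 2 6 7' > "$sample"

print_field "$sample" 3   # field 3 is RESULT2, because field 1 holds the gene names
rm -f "$sample"
```

Passing the value with -v keeps the awk program inside single quotes, so the shell never rewrites $COL before awk sees it.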

Putting this together in a simplified column function yields:

#!/bin/bash

FICHERO="affy.txt"

function column
{
    if [ ! -s "$FICHERO" ]; then
      echo "The file does not exist or contains no entries"
      return 1
    fi

    echo "Print the column number you are looking for: "
    read -r -p "column: " column
    col="${column#-}"   # Drop a leading minus sign, if any
    echo "You are working with column number: $col"

    if [ "$col" = "" ]; then
      echo "Insert Columns please!"
    else
      echo
      col=$((col + 1)) # Add 1 because field 1 holds the gene names, so data column 1 is field 2
      awk -v "COL=$col" 'BEGIN{sum=0}
      { if (NR <= 1) { next }              # Skip first line which is the column name
        if (NR == 2) { min = max = $COL }  # Seed min and max from the first data row
        if ($COL < min) { min = $COL }     # Store minimum so far
        if ($COL > max) { max = $COL }     # Store maximum so far
        sum += $COL }                      # Store sum of the column
        END { print "minim columna=" min
              print "maxim columna=" max
              print "average columna=" sum/(NR-1) }' "$FICHERO"
    fi
}

column

This will print the information for one column of a file in the format you described. From the code you posted it was unclear whether you wanted to handle multiple columns in the same input; if so, I'll leave that as an exercise for you.
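
For readers who do want several columns at once, one possible approach (a sketch, not the answerer's code) is to wrap the awk call in a function and loop over the requested numbers. The names column_stats and columns_stats, the output format, and the sample file are all assumptions introduced for illustration.

```shell
#!/bin/bash

# Hypothetical helper: print min, max and mean of data column $2 in file $1.
# Data column 1 is awk field 2, because field 1 holds the gene names.
column_stats() {
    local file="$1" col="$(( $2 + 1 ))"
    awk -v "COL=$col" '
        NR == 1 { next }                 # skip the header line
        NR == 2 { min = max = $COL }     # seed min and max from the first data row
        { if ($COL < min) min = $COL
          if ($COL > max) max = $COL
          sum += $COL }
        END { if (NR > 1)
                  printf "column %d: min=%s max=%s mean=%s\n", COL - 1, min, max, sum / (NR - 1) }
    ' "$file"
}

# Hypothetical wrapper: take a file and any number of column numbers.
columns_stats() {
    local file="$1" c
    shift
    for c in "$@"; do
        column_stats "$file" "$c"
    done
}

# Assumed sample data in the layout discussed in this thread.
sample=$(mktemp)
printf '%s\n' 'GENE  RESULT1 RESULT2 RESULT3' \
              'GENE1    1       6       9' \
              'GENE2    2       6       7' \
              'GENE3    2       4       9' \
              'GENE4    1       6       9' > "$sample"

columns_stats "$sample" 1 3
rm -f "$sample"
```

In the interactive script, the user's reply could be split on spaces and passed as the extra arguments; that wiring is left out here to keep the sketch short.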

Answer 2:

There is a simpler way. Forget about the entire program and just focus on the problem at hand: how would you generally get a column to print, given your file?

GENE  RESULT1 RESULT2 RESULT3
GENE1    1       6       9
GENE2    2       6       7
GENE3    2       4       9
GENE4    1       6       9

GOAL: Obtain RESULT values by positional number (1, 2, 3, etc.).
INPUT: A positive integer x with 1 <= x <= 3.
OUTPUT: A column of numbers.

Implicit truths: Column 1 ($1) is for labels and no returned column should ever have header information in it. That makes the following the real playing field:

1       6       9
2       6       7
2       4       9
1       6       9

Solution:

User input.

The first thing this section of your program should do is make it possible to use the number 1. Right now, $1 would refer to the column of GENE labels. So, whenever a value is entered, add 1 to it so that you always get the column you really want. Zero and negative numbers are therefore not acceptable replies from the user. Also, there are only three result columns in play, so 3 is the largest number your program should handle (right now).
The second thing you should consider is using a command that will print all lines except the first line. The head and tail commands have many options.

tail -n +2 affy.txt

This yields ...

GENE1    1       6       9
GENE2    2       6       7
GENE3    2       4       9
GENE4    1       6       9

Thus, when the command above is piped to awk

read -r REPLY            # <---- User inputs 2

REPLY=$((REPLY + 1))     # REPLY is now 3

tail -n +2 affy.txt | awk -v col="$REPLY" '{ print $col }'

6
6
4
6

Boom! The second result column is printed. You may find this solution easier to integrate into your program than the others listed here.
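
The pieces of this answer (checking the reply, dropping the header with tail, selecting the field with awk) could be combined roughly as follows. This is only a sketch under stated assumptions: the helper names valid_column, get_column and column_summary are invented here, the upper bound of 3 comes from the sample table, and the values are assumed to use a decimal point (the tr ',' '.' step from the question would still be needed for decimal commas). The min and max are taken with sort, as in the question's own gene function.

```shell
#!/bin/bash

# Hypothetical helper: succeed only if $1 is a whole number between 1 and $2.
valid_column() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;     # reject empty input, signs and letters
    esac
    [ "$1" -ge 1 ] && [ "$1" -le "$2" ]
}

# Hypothetical helper: print data column $2 of file $1, one value per line.
get_column() {
    # tail -n +2 drops the header line; field 1 holds the labels, hence the + 1.
    tail -n +2 "$1" | awk -v col="$(( $2 + 1 ))" '{ print $col }'
}

# Hypothetical helper: report min, max and mean of data column $2 of file $1.
column_summary() {
    local file="$1" reply="$2" values
    if ! valid_column "$reply" 3; then     # 3 result columns, as in the sample table
        echo "Please enter a number from 1 to 3" >&2
        return 1
    fi
    values=$(get_column "$file" "$reply")
    echo "Min value: $(printf '%s\n' "$values" | sort -n | head -n 1)"
    echo "Max value: $(printf '%s\n' "$values" | sort -n | tail -n 1)"
    echo "Mean value: $(printf '%s\n' "$values" | awk '{ sum += $1 } END { if (NR) print sum / NR }')"
}

# Demo on the sample table from this answer, written to a temporary file.
sample=$(mktemp)
printf '%s\n' 'GENE  RESULT1 RESULT2 RESULT3' \
              'GENE1    1       6       9' \
              'GENE2    2       6       7' \
              'GENE3    2       4       9' \
              'GENE4    1       6       9' > "$sample"

column_summary "$sample" 2
rm -f "$sample"
```

In the menu script, the second argument would be the value obtained from read rather than a literal 2.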

Source: https://stackoverflow.com/questions/19608566/read-an-input-number-and-redirect-it-to-a-column-in-a-file

Tags

Linux

bash

case

calculated-columns