How do I average column values from a tab-separated data file, ignoring a header row and the left column?

Submitted by 守給你的承諾、 on 2019-11-28 08:56:50

Question


My task is to compute averages from the following data file, titled Lab1_table.txt:

retrovirus      genome  gag     pol     env
HIV-1           9181    1503    3006    2571
FIV             9474    1353    2993    2571
KoRV            8431    1566    3384    1980
GaLV            8088    1563    3498    2058
PERV            8072    1560    3621    1532

I have to write a script that opens this file, reads each line and splits its contents into an array, computes the average of the numerical values in each column (genome, gag, pol, env), and writes those averages to a new file.

I've been trying my best to figure out how to not take into account the first row, or the first column, but every time I try to execute on the command line I keep coming up with 'explicit package name' errors.

Global symbol @average requires explicit package name at line 23.
Global symbol @average requires explicit package name at line 29.
Execution aborted due to compilation errors.

I understand that this involves @ and $, but even knowing that I've not been able to change the errors.

This is my code, but I emphasise that I'm a beginner having started this just last week:

#!/usr/bin/perl -w
use strict;

my $infile = "Lab1_table.txt"; # This is the file path
open INFILE, $infile or die "Can't open $infile: $!";

my $count = 0;
my $average = ();

while (<INFILE>) {
    chomp;
    my @columns = split /\t/;
    $count++;
    if ( $count == 1 ) {
        $average = @columns;
    }
    else {
        for( my $i = 1; $i < scalar $average; $i++ )  {
            $average[$i] += $columns[$i];
        }
    }
}

for( my $i = 1; $i < scalar $average; $i++ ) {
    print $average[$i]/$count, "\n";
}

I'd appreciate any insight, and I would also greatly appreciate it if you could number the steps of what you're doing, where appropriate. I'd like to learn, and it would make more sense to me if I could read through someone's process.


Answer 1:


Here are the points you need to change.
First, use a separate variable for the headers, and declare @average as an array:

my $count = 0;
my @header = ();
my @average = ();
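As background on why the original error message names @average even though only $average was declared: in Perl, $average[$i] is an element of the array @average, which is a different variable from the scalar $average. A minimal sketch, using made-up values purely for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# A scalar and an array may share a name; they are separate variables.
my $average = 42;              # scalar: holds a single value
my @average = (10, 20, 30);    # array: holds a list of values

# $average[1] reads an element of the ARRAY @average, not the scalar.
# The leading $ means "one value"; the [1] says it comes from an array.
print "scalar:  $average\n";
print "element: $average[1]\n";

# Assigning an array to a scalar stores its element count.
my $n = @average;
print "count:   $n\n";
```

With only "my $average" declared, strict has no declaration for @average to match $average[$i] against, which is what the 'explicit package name' message is complaining about.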

Then change the logic inside the if statement:

if ( $count == 1 ) {
    @header = @columns;
}

Now don't use @average for the loop limit; use $i < scalar @columns in the else branch. Initially @average is empty, so scalar @average is 0 and the body of the for loop would never run.

else {
    for( my $i = 1; $i < scalar @columns; $i++ )  {
        $average[$i] += $columns[$i];
    }
}
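To see the difference between the two loop limits, here is a small sketch that counts iterations for each. The counter names are hypothetical, and the single data row is copied from the table in the question:

```perl
use strict;
use warnings;

my @average = ();    # nothing accumulated yet
my @columns = qw(HIV-1 9181 1503 3006 2571);

# Bounded by @average: the limit is 0, so the body never runs.
my $runs_by_average = 0;
for (my $i = 1; $i < scalar @average; $i++) {
    $runs_by_average++;
}

# Bounded by @columns: runs once per numeric column (indices 1 to 4).
my $runs_by_columns = 0;
for (my $i = 1; $i < scalar @columns; $i++) {
    $average[$i] += $columns[$i];    # assigning past the end grows @average
    $runs_by_columns++;
}

print "bounded by \@average: $runs_by_average iterations\n";
print "bounded by \@columns: $runs_by_columns iterations\n";
print "size of \@average afterwards: ", scalar @average, "\n";
```

After the second loop @average has five slots (index 0 is unused), which is why the later printing loop can safely use scalar @average as its limit.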

Finally, subtract 1 from the counter when you divide. Remember that the counter was also incremented for the header line, so $count is one more than the number of data rows:

for( my $i = 1; $i < scalar @average; $i++ ) {
    print $average[$i]/($count-1), "\n";
}

Here is the final code.
The header row is kept in @header, which you can use to label the results when you print them.

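#!/usr/bin/perl -w

use strict;

my $infile = "Lab1_table.txt"; # This is the file path
open INFILE, $infile or die "Can't open $infile: $!";

my $count = 0;      # lines read so far, including the header
my @header = ();    # column names from the first line
my @average = ();   # running sum for each numeric column

while (<INFILE>) {
    chomp;
    my @columns = split /\t/;
    $count++;
    if ( $count == 1 ) {
        # First line: remember the column names
        @header = @columns;
    }
    else {
        # Data line: start at index 1 to skip the name column
        for( my $i = 1; $i < scalar @columns; $i++ )  {
            $average[$i] += $columns[$i];
        }
    }
}
close INFILE;

# Divide each sum by the number of data lines (all lines minus the header)
for( my $i = 1; $i < scalar @average; $i++ ) {
    print $average[$i]/($count-1), "\n";
}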
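The question also asks for the averages to be written to a new file, and the header row can label each value. The following is only a sketch of one way to do both, not part of the corrected code above: the sub names column_averages and write_averages, the skipping of blank lines, and the one-decimal output format are all assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Read tab-separated rows from a filehandle. Returns references to the
# header row and to the per-column averages (index 0, the name column,
# is left undefined).
sub column_averages {
    my ($fh) = @_;
    my (@header, @sum);
    my $rows = 0;
    while (my $line = <$fh>) {
        chomp $line;
        next unless $line =~ /\S/;               # skip blank lines
        my @columns = split /\t/, $line;
        if (!@header) {                          # first row: column names
            @header = @columns;
            next;
        }
        $sum[$_] += $columns[$_] for 1 .. $#columns;
        $rows++;
    }
    my @average = map { $_ ? $sum[$_] / $rows : undef } 0 .. $#sum;
    return (\@header, \@average);
}

# Write one "name<TAB>average" line per numeric column to $outfile.
sub write_averages {
    my ($infile, $outfile) = @_;
    open my $in,  '<', $infile  or die "Can't open $infile: $!";
    open my $out, '>', $outfile or die "Can't write $outfile: $!";
    my ($header, $average) = column_averages($in);
    printf {$out} "%s\t%.1f\n", $header->[$_], $average->[$_]
        for 1 .. $#$average;
    close $out or die "Can't close $outfile: $!";
    close $in;
}

# Only runs when both file names are given on the command line.
write_averages(@ARGV) if @ARGV == 2;
```

Saved under a name of your choice (averages.pl here is hypothetical), it could be run as "perl averages.pl Lab1_table.txt Lab1_averages.txt". For the table in the question, assuming the columns really are tab-separated, the output file would contain genome 8649.2, gag 1509.0, pol 3300.4 and env 2142.4, one per line.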

There are other ways to write this code, but I thought it would be better to just correct yours so that you can easily see what was wrong with it. Hope it helps.



Source: https://stackoverflow.com/questions/9677533/how-do-i-average-column-values-from-a-tab-separated-data-file-ignoring-a-header
