These are the files I am reading:
#Log1
Time Src_id Des_id Address
0 34 56 x9870
2 36 58 x9872
4 38 60 x9874
6 40 62 x9876
8 42 64 x9878
That is because @fields* gets overwritten on every loop iteration. You need this:
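while (my $line = <IN1>) {
    # A fresh @tmp is created on every iteration, so each stored reference is distinct.
    my @tmp = split(" ", $line);
    push(@fields1, \@tmp);
}
foreach my $item (@fields1) {
    # Dereference each row and print its fields separated by spaces.
    print("@{$item}\n");
}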
Then @fields1 contains references pointing to the split arrays. The final @fields1 looks like:
@fields1 = (
<ref> ----> ["0", "34", "56", "x9870"]
<ref> ----> ["2", "36", "58", "x9872"]
...
)
The print loop then outputs:
Time Src_id Des_id Address
0 34 56 x9870
2 36 58 x9872
4 38 60 x9874
6 40 62 x9876
8 42 64 x9878
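Because each element of @fields1 is an array reference, a single field can be reached with a second subscript. The following is only a minimal sketch: the two rows are hard-coded from the sample log instead of being read from IN1, and the helper name column is hypothetical.

```perl
use strict;
use warnings;

# Two rows from Log1, hard-coded here instead of being read from a file.
my @fields1 = (
    [ '0', '34', '56', 'x9870' ],
    [ '2', '36', '58', 'x9872' ],
);

# Hypothetical helper: collect one column (by index) from every row.
sub column {
    my ($rows, $idx) = @_;
    return map { $_->[$idx] } @{$rows};
}

# $fields1[1][3] is shorthand for $fields1[1]->[3].
print "Address of row 1: $fields1[1][3]\n";
print "Times: ", join(',', column(\@fields1, 0)), "\n";
```

With these two rows, the first print shows x9872 and the second shows the Time column, 0,2.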
It would probably also be better to call chomp($line) first.
Personally, though, I would simply do push(@fields1, $line) and split each array item later, at the comparison stage.
To compare the contents of the 2 files, I would use 2 while loops to read them into 2 arrays, just as you have done, and then do the comparison in a single for or foreach loop.
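A rough sketch of that approach follows. It assumes both arrays hold the same number of lines and that every line has the same number of columns. The lines are hard-coded so the example runs without input files, and the helper name diff_columns is made up for illustration.

```perl
use strict;
use warnings;

# Raw lines as they would be pushed while reading each file.
my @lines1 = ("0 34 56 x9870", "2 36 58 x9872");
my @lines2 = ("1 35 57 x9871", "2 36 58 x9872");

# Hypothetical helper: split two raw lines and return the indexes
# of the columns whose values differ.
sub diff_columns {
    my ($line1, $line2) = @_;
    my @f1 = split ' ', $line1;
    my @f2 = split ' ', $line2;
    return grep { $f1[$_] ne $f2[$_] } 0 .. $#f1;
}

# One loop over both arrays; the split happens only at the comparison stage.
for my $i (0 .. $#lines1) {
    my @diff = diff_columns($lines1[$i], $lines2[$i]);
    print "line $i: ", (@diff ? "differs in columns @diff" : "identical"), "\n";
}
```

What "comparison" should mean (equal fields, ordered times, and so on) is not stated in the question, so the string inequality test here is only a placeholder.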
The following code demonstrates how to read and print the log files (the OP does not specify why the lines are split into fields):
use strict;
use warnings;
use feature 'say';

my $fname1 = 'log1.txt';
my $fname2 = 'log2.txt';
my $div = "\t";

my $file1 = read_file($fname1);
my $file2 = read_file($fname2);

print_file($file1,$div);
print_file($file2,$div);

# Read a log file and return a reference to an array of rows,
# each row being a reference to an array of fields.
sub read_file {
    my $fname = shift;
    my @data;

    open my $fh, '<', $fname
        or die "Couldn't read $fname: $!";

    while( <$fh> ) {
        chomp;
        next if /^#Log/;        # skip the '#Log...' marker line
        push @data, [split];    # split $_ on whitespace
    }

    close $fh;

    return \@data;
}

# Print every row with its fields joined by the given divider.
sub print_file {
    my $data = shift;
    my $div = shift;

    say join($div,@{$_}) for @{$data};
}
Output
Time Src_id Des_id Address
0 34 56 x9870
2 36 58 x9872
4 38 60 x9874
6 40 62 x9876
8 42 64 x9878
Time Src_id Des_id Address
1 35 57 x9871
3 37 59 x9873
5 39 61 x9875
7 41 63 x9877
9 43 65 x9879
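Let's assume that the OP wants to merge the two files into one, with the lines sorted on the Time field. The code below reads both files into a %data hash with the Time field as key, prints the header (@fields), and then prints the rows sorted on the Time key.
use strict;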
use warnings;
use feature 'say';

my(@fields,%data);
my $fname1 = 'log1.txt';
my $fname2 = 'log2.txt';

read_data($fname1);
read_data($fname2);

# Header first, then the rows in numeric order of their Time key.
say join("\t",@fields);
say join("\t",@{$data{$_}}) for sort { $a <=> $b } keys %data;

sub read_data {
    my $fname = shift;

    open my $fh, '<', $fname
        or die "Couldn't open $fname: $!";

    while( <$fh> ) {
        next if /^#Log/;                # skip the '#Log...' marker line
        if( /^Time/ ) {
            @fields = split;            # header row
        } else {
            my @line = split;
            $data{$line[0]} = \@line;   # key each row by its Time field
        }
    }

    close $fh;
}
Output
Time Src_id Des_id Address
0 34 56 x9870
1 35 57 x9871
2 36 58 x9872
3 37 59 x9873
4 38 60 x9874
5 39 61 x9875
6 40 62 x9876
7 41 63 x9877
8 42 64 x9878
9 43 65 x9879
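Note that the hash above keeps one row per Time value, so if both files contained a row with the same Time, the row read later would replace the earlier one. The sample logs have no such collision. If that case matters, one possible variation is to store a list of rows under each key. This is only a sketch: the helper name add_row is hypothetical and the third input row is made up to force a collision.

```perl
use strict;
use warnings;

my %data;

# Hypothetical helper: store a row under its Time key without overwriting
# rows that already use the same key.
sub add_row {
    my ($line) = @_;
    my @row = split ' ', $line;
    push @{ $data{ $row[0] } }, \@row;
}

# The last row is invented so that Time 0 appears twice.
add_row($_) for '0 34 56 x9870', '1 35 57 x9871', '0 99 98 x0000';

# Print all rows, grouped and sorted numerically by Time.
for my $time (sort { $a <=> $b } keys %data) {
    print join("\t", @{$_}), "\n" for @{ $data{$time} };
}
```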
You can merge the log files side by side using paste, and read the resulting merged file one line at a time. This is arguably more elegant and saves RAM, since neither file is held in memory. Here is an example of a possible comparison of time1 and time2, writing STDOUT and STDERR into separate files. The example prints all the input fields to STDOUT if time1 < time2 and time1 < 4; otherwise it prints a warning to STDERR:
cat > log1.log <<EOF
Time Src_id Des_id Address
0 34 56 x9870
2 36 58 x9872
4 38 60 x9874
6 40 62 x9876
8 42 64 x9878
EOF
cat > log2.log <<EOF
Time Src_id Des_id Address
1 35 57 x9871
3 37 59 x9873
5 39 61 x9875
7 41 63 x9877
9 43 65 x9879
EOF
# Paste files side by side, skip header, read data lines together, compare and print:
paste log1.log log2.log | \
    tail -n +2 | \
    perl -lane '
        BEGIN {
            for $file_num (1, 2) { push @col_names, map { "$_$file_num" } qw( time src_id des_id address ) }
        }
        my %val;
        @val{ @col_names } = @F;
        if ( $val{time1} < $val{time2} and $val{time1} < 4 ) {
            print join "\t", @val{ @col_names };
        } else {
            warn "not found: @val{ qw( time1 time2 ) }";
        }
    ' 1>out.tsv 2>out.log
Output:
% cat out.tsv
0 34 56 x9870 1 35 57 x9871
2 36 58 x9872 3 37 59 x9873
% cat out.log
not found: 4 5 at -e line 10, <> line 3.
not found: 6 7 at -e line 10, <> line 4.
not found: 8 9 at -e line 10, <> line 5.
The Perl one-liner uses these command line flags:
-e : Tells Perl to look for code in-line, instead of in a file.
-n : Loop over the input one line at a time, assigning it to $_ by default.
-l : Strip the input line separator ("\n" on *NIX by default) before executing the code in-line, and append it when printing.
-a : Split $_ into array @F on whitespace or on the regex specified in -F option.
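To make the effect of -n, -l and -a more concrete, here is a rough long-hand equivalent written as an ordinary script. It is only an approximation of what Perl does for these flags (perlrun has the exact details). The helper name first_fields is hypothetical, and an in-memory filehandle stands in for the piped input.

```perl
use strict;
use warnings;

# Hypothetical helper mimicking what -n, -l and -a do around the -e code.
sub first_fields {
    my ($fh) = @_;
    my @out;
    while (my $line = <$fh>) {      # -n: loop over the input line by line
        chomp $line;                # -l: strip the input line separator
        my @F = split ' ', $line;   # -a: autosplit on whitespace into @F
        push @out, $F[0];           # stands in for the -e code
    }
    return @out;
}

# In-memory filehandle used in place of real piped input.
my $input = "0 34 56 x9870\n2 36 58 x9872\n";
open my $fh, '<', \$input or die "Cannot open in-memory input: $!";
print "$_\n" for first_fields($fh);   # -l would append the "\n" automatically
close $fh;
```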
SEE ALSO:
perldoc perlrun: how to execute the Perl interpreter: command line switches