How do I open an array of files in Perl?

僤鯓⒐⒋嵵緔 submitted on 2019-12-05 21:36:07

There are a lot of issues here, starting with the call to "ls | grep" :)

Let's start with some code:

First, let's get the list of files:

my @files = glob( '*.txt' );

But it would be better to check that each name refers to a regular file rather than a directory:

my @files = grep { -f } glob( '*.txt' );

Now, let's open these files to read them:

my @fhs = map { open my $fh, '<', $_; $fh } @files;

But we need a way to handle errors. In my opinion, the best way is to add:

use autodie;

at the beginning of the script (and install autodie if you don't have it yet; it has shipped with Perl since 5.10.1). Alternatively, you can use:

use Fatal qw( open );
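To make the effect of autodie concrete, here is a minimal sketch. The file name is a hypothetical one assumed not to exist, and the eval wrapper is only there so the demo can report the failure instead of terminating.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use autodie;    # open/close now throw an exception on failure

# Hypothetical file name, assumed not to exist in the current directory.
my $missing = 'no-such-file-for-autodie-demo.txt';

# eval catches the exception so the demo can inspect it.
my $ok = eval {
    open my $fh, '<', $missing;    # no "or die ..." needed
    1;
};
my $error = $@;

print "open failed as expected: $error\n" unless $ok;
```

Without the eval, the script would simply stop at the failing open with a message that names the file.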

Now that we have it, let's get the first line (as you showed in your example) from each of the inputs and concatenate them:

my $concatenated = '';

for my $fh ( @fhs ) {
    my $line = <$fh>;
    $concatenated .= $line;
}

That is perfectly fine and readable, but it can be shortened, while (in my opinion) staying readable, to:

my $concatenated = join '', map { scalar <$_> } @fhs;

The effect is the same: $concatenated contains the first line of every file.

So the whole program would look like this:

#!/usr/bin/perl
use strict;
use warnings;
use autodie;
# use Fatal qw( open ); # uncomment if you don't have autodie

my @files        = grep { -f } glob( '*.txt' );
my @fhs          = map { open my $fh, '<', $_; $fh } @files;
my $concatenated = join '', map { scalar <$_> } @fhs;

Now, you might want to concatenate not just the first lines but all of them. In that case, instead of the $concatenated = ... code, you'd need something like this, which takes one line from each file in turn until every file is exhausted:

my $concatenated = '';

while (my $fh = shift @fhs) {
    my $line = <$fh>;
    if ( defined $line ) {
        push @fhs, $fh;
        $concatenated .= $line;
    } else {
        close $fh;
    }
}
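As a rough end-to-end check of the loop above, the sketch below wraps it in a function and runs it on two temporary files. The function name interleave_lines, the file names, and their contents are all made up for the demo.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use autodie;
use File::Temp qw(tempdir);

# Round-robin concatenation: take one line from each handle in turn,
# dropping a handle once it reaches end-of-file.
# (interleave_lines is a hypothetical helper name.)
sub interleave_lines {
    my @fhs          = @_;
    my $concatenated = '';
    while ( my $fh = shift @fhs ) {
        my $line = <$fh>;
        if ( defined $line ) {
            push @fhs, $fh;            # still has data: back of the queue
            $concatenated .= $line;
        }
        else {
            close $fh;                 # exhausted: do not requeue
        }
    }
    return $concatenated;
}

# Demo data in a temporary directory (hypothetical names and contents).
my $dir     = tempdir( CLEANUP => 1 );
my %content = ( 'a.txt' => "a1\na2\na3\n", 'b.txt' => "b1\n" );
for my $name ( sort keys %content ) {
    open my $out, '>', "$dir/$name";
    print {$out} $content{$name};
    close $out;
}

my @files  = map { "$dir/$_" } sort keys %content;
my @fhs    = map { open my $fh, '<', $_; $fh } @files;
my $result = interleave_lines(@fhs);
print $result;
```

Because the handles are cycled through a queue, the output interleaves the files line by line rather than appending one whole file after another.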
Chris Lutz

Here is your problem:

for my $i (0..$#files) {
  my @blah = <$files[$i]>;
  $concat .= $blah;
}

First, <$files[$i]> isn't a valid filehandle read. This is the source of your GLOB(...) errors. See mobrule's answer for why this is the case. So change it to this:

for my $file (@files) {
  my @blah = <$file>;
  $concat .= $blah;
}

Second problem: you're mixing @blah (an array named blah) and $blah (a scalar named blah). This is the source of your "uninitialized value" errors: $blah (the scalar) hasn't been initialized, but you're using it. If you want the $n-th line from @blah, use this:

for my $file (@files) {
  my @blah = <$file>;
  $concat .= $blah[$n];
}

I don't want to keep beating a dead horse, but I do want to address a better way to do something:

my $text = `ls | grep ".txt"`;
my @temps = split(/\n/,$text);

This reads in a list of all files in the current directory whose names contain ".txt" (strictly speaking, any character followed by "txt", since grep treats the dot as a wildcard). This works, but it can be rather slow: we have to call out to the shell, which has to fork off to run ls and grep, and that incurs a bit of overhead. Furthermore, ls and grep are simple and common programs, but not exactly portable. Surely there's a better way to do this:

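my @temps;
# Use a lexical directory handle and report a failure to open it.
opendir(my $dh, ".") or die "Cannot open current directory: $!";
while (my $file = readdir($dh)) {
  push @temps, $file if $file =~ /\.txt/;
}
closedir($dh);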

Simple, short, pure Perl, no forking, no non-portable shells, and we don't have to read in the string and then split it: we store only the entries we really need. Plus, it becomes trivial to modify the conditions for files that pass the test. Say we end up accidentally reading the file test.txt.gz because our regex matches it; we can easily change that line to:

  push @temps, $file if $file =~ /\.txt$/;

We can do that one with grep (I believe), but why settle for grep's limited regular expressions when Perl has one of the most powerful regex libraries anywhere built-in?
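The directory scan can be packaged as a small function, sketched below under a few assumptions: txt_files is a hypothetical helper name, the test files are invented, and the pattern uses \z (absolute end of string) as a slightly stricter variant of the $ anchor shown above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Return the names in $dir that end in ".txt", sorted.
# (txt_files is a hypothetical helper name.)
sub txt_files {
    my ($dir) = @_;
    opendir( my $dh, $dir ) or die "Cannot open directory '$dir': $!";
    my @names = sort grep { /\.txt\z/ } readdir($dh);
    closedir($dh);
    return @names;
}

# Demo: create three empty files in a temporary directory.
my $dir = tempdir( CLEANUP => 1 );
for my $name (qw(notes.txt test.txt.gz readme.md)) {
    open( my $out, '>', "$dir/$name" ) or die "Cannot create '$name': $!";
    close($out);
}

my @found = txt_files($dir);
print "@found\n";
```

Only notes.txt passes the filter here; test.txt.gz is rejected because the match is anchored at the end of the name.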

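Use readline (or copy the handle into a plain scalar first) instead of putting $files[$i] inside the <> operator:

my @blah = readline($files[$i]);

Perl treats <...> as a read from a filehandle only when it contains a bareword handle or a simple scalar variable such as $fh. Anything else, including an array element like $files[$i], is interpreted as the file glob operator instead of the read-from-filehandle operator.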
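The difference between the two interpretations of <> can be seen in a small sketch. It assumes a throwaway temporary file with two lines; the variable names are only illustrative.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a temporary file with two lines (demo data).
my ( $out, $path ) = tempfile( UNLINK => 1 );
print {$out} "first\nsecond\n";
close($out) or die "Cannot close '$path': $!";

open( my $fh, '<', $path ) or die "Cannot open '$path': $!";
my @files = ($fh);

# An array element inside <> is parsed as glob("$files[0]"): a file-name
# pattern built from the stringified handle, not a read from the handle.
my @globbed = <$files[0]>;

# readline() accepts any expression that yields a filehandle.
my @lines = readline( $files[0] );
close($fh);

print "glob gave: @globbed\n";
print "readline gave ", scalar(@lines), " lines\n";
```

Copying the element into a simple scalar first (my $h = $files[0]; my @lines = <$h>;) should work equally well, which is why the foreach loop over @files shown earlier reads correctly.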

You've got some good answers already. Another way to tackle the problem is to create a list-of-lists containing all of the lines from the files (@content). Then use the each_arrayref function from List::MoreUtils, which will create an iterator that yields line 1 from all files, then line 2, etc.

use strict;
use warnings;
use List::MoreUtils qw(each_arrayref);

my @content =
    map {
        open(my $fh, '<', $_) or die $!;
        [<$fh>]
    }
    grep {-f}
    glob '*.txt'
;
my $iterator = each_arrayref @content;
while (my @nth_lines = $iterator->()){
    # Do stuff with @nth_lines;
}
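If List::MoreUtils is not available, a similar line-by-line traversal can be approximated with core modules only. This is a sketch of the same idea, not the each_arrayref API: @content is filled with stand-in data here instead of real file contents, and @rows is a hypothetical name for the collected results.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use List::Util qw(max);

# Stand-in data: one array ref of lines per file (hypothetical contents).
my @content = (
    [ "a1\n", "a2\n", "a3\n" ],
    [ "b1\n" ],
);

# Length of the longest file, or 0 when there are no files at all.
my $longest = max( map { scalar @$_ } @content ) // 0;

my @rows;
for my $n ( 0 .. $longest - 1 ) {
    # Line $n from every file; shorter files contribute undef.
    my @nth_lines = map { $_->[$n] } @content;
    push @rows, \@nth_lines;
}

printf "%d rows collected\n", scalar @rows;
```

As in the iterator version, files that run out of lines early yield undef, so any code that processes a row needs to check for defined values.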