Question
When run on perl 5.16.3 and older versions, the following code excerpt behaves strangely: subsequent evaluations of a glob in the line input operator keep returning values from the earlier list, rather than running the glob anew.
#!/usr/bin/env perl
use strict;
use warnings;
my @dirs = ("/tmp/foo", "/tmp/bar");
foreach my $dir (@dirs) {
    my $count = 0;
    my $glob = "*";
    print "Processing $glob in $dir\n";
    while (<$dir/$glob>) {
        print "Processing file $_\n";
        $count++;
        last if $count > 0;
    }
}
If I put two files in /tmp/foo and one or more in /tmp/bar, and run the code, I get the following output:
Processing * in /tmp/foo
Processing file /tmp/foo/foo.1
Processing * in /tmp/bar
Processing file /tmp/foo/foo.2
I thought that when the while terminates after the last, the new invocation of the while on the second iteration would re-run the glob and give me the files listed in /tmp/bar, but instead I get a continuation of what's in /tmp/foo.
It's almost like the angle operator glob is acting like a precompiled pattern. My hypothesis is that the angle operator is creating a filehandle in the symbol table that's still open and being reused behind the scenes, and that it's scoped to the containing foreach, or possibly the whole subroutine.
Answer 1:
From I/O Operators in perlop (my emphasis):
A (file)glob evaluates its (embedded) argument only when it is starting a new list. All values must be read before it will start over. In list context, this isn't important because you automatically get them all anyway. However, in scalar context the operator returns the next value each time it's called, or undef when the list has run out.
Since <> is called in scalar context here and you exit the loop with last after the first iteration, the next time you enter it, it keeps reading from the original list.
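To see this in isolation, here is a minimal sketch that reproduces the effect without touching /tmp/foo or /tmp/bar. The scratch directories and file names are made up for illustration, and it assumes a Unix-like system where the temporary paths contain no whitespace or glob metacharacters. Going by the documentation quoted above, the second pass should pick up foo.2 from the first directory rather than bar.1.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical scratch directories standing in for /tmp/foo and /tmp/bar.
my $foo = tempdir(CLEANUP => 1);
my $bar = tempdir(CLEANUP => 1);
for my $path ("$foo/foo.1", "$foo/foo.2", "$bar/bar.1") {
    open my $fh, '>', $path or die "Cannot create $path: $!";
    close $fh;
}

my @seen;
for my $dir ($foo, $bar) {
    # There is only one glob operator in the source text, so both passes
    # of the outer loop share a single iterator.
    while (my $file = <$dir/*>) {
        push @seen, $file;
        last;    # leaves the iterator part-way through its list
    }
}

print "$_\n" for @seen;
```

The pattern is only interpolated when a new list is started, which is why changing $dir between passes has no effect until the old list has been read to the end.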
It is clarified in comments that there is a practical need behind this question: to process only some of the files in a directory without ever building the full list of filenames, since there can be many.
So assigning from glob to a list and working with that, or better yet using for instead of while as commented by ysth, doesn't help here, since either way the whole (possibly huge) list is built.
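For reference, the list-context form looks roughly like the sketch below. It does start a fresh list on every pass, but only because each evaluation expands the whole pattern up front, which is exactly the cost to be avoided here. The scratch directories and file names are again made up for illustration, with the same assumptions about a Unix-like system and well-behaved temporary paths.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical scratch directories standing in for /tmp/foo and /tmp/bar.
my $foo = tempdir(CLEANUP => 1);
my $bar = tempdir(CLEANUP => 1);
for my $path ("$foo/foo.1", "$foo/foo.2", "$bar/bar.1") {
    open my $fh, '>', $path or die "Cannot create $path: $!";
    close $fh;
}

my @first;
for my $dir ($foo, $bar) {
    # foreach imposes list context, so glob returns every match at once
    # and keeps no leftover state for the next pass.
    for my $file (glob "$dir/*") {
        push @first, $file;
        last;    # safe here: the list was already fully consumed by foreach
    }
}

print "$_\n" for @first;
```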
I haven't found a way to make glob (which is what <> with a filename pattern uses) drop and rebuild the list once it has generated it, without getting to its end first.
Apparently, each instance of the operator gets its own list. So using another <> inside the while loop in the hope of resetting it, in any way and even with the same pattern, doesn't affect the list being iterated over in while (<$glob>).
Just to note, breaking out of the loop with a die (with while in an eval) doesn't help either; the next time we come to that while the same list is continued. Wrapping it in a closure
sub iter_glob { my $dir = shift; return sub { scalar <"$dir/*"> } }
for my $d (@dirs) {
    my $iter = iter_glob($d);
    while (my $f = $iter->()) {
        # ...
    }
}
met with the same fate; the original list keeps being used.
The solution then is to use readdir instead.
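A minimal sketch of that approach follows. The helper name first_entries and its limit parameter are hypothetical, introduced here only for illustration. Unlike glob, readdir does no pattern matching and returns entries in no particular order, so the sketch skips names starting with a dot to roughly mirror what "*" would match.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: return at most $limit entries from $dir, reading
# the directory one entry at a time instead of building the full list.
sub first_entries {
    my ($dir, $limit) = @_;
    return if $limit < 1;
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    my @found;
    while (defined(my $entry = readdir $dh)) {
        next if $entry =~ /^\./;    # skip ".", ".." and hidden files, as "*" does
        push @found, "$dir/$entry";
        last if @found >= $limit;   # stop early; the handle is closed below
    }
    closedir $dh;
    return @found;
}

for my $dir ("/tmp/foo", "/tmp/bar") {
    next unless -d $dir;            # the example paths may not exist
    print "Processing $dir\n";
    print "Processing file $_\n" for first_entries($dir, 1);
}
```

Because each call opens its own directory handle and closes it again, leaving the loop early carries no state over to the next directory.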
Source: https://stackoverflow.com/questions/44856014/line-input-operator-with-glob-returning-old-values