What reasons are there to prefer glob over readdir (or vice-versa) in Perl?

前端未结

This question is a spin-off from this one. Some history: when I first learned Perl, I pretty much always used glob rather than opendir + read


                      
10 answers

别跟我提以往, 2020-12-01 03:01

  glob pros:
  
  3) No need to prepend the directory name onto items manually


Exception:

say for glob "*";

--output:--
1perl.pl
2perl.pl
2perl.pl.bak
3perl.pl
3perl.pl.bak
4perl.pl
data.txt
data1.txt
data2.txt
data2.txt.out


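As far as I can tell, the rule for glob is that each result carries exactly the directory part used in the pattern: you must provide a full path to the directory to get full paths back, and a bare pattern such as "*" returns bare filenames. The Perl docs do not seem to mention that, and neither do any of the posts here.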

That means that glob can be used in place of readdir when you want just filenames (rather than full paths), and you don't want hidden files returned, i.e. ones starting with '.'.  For example, 

chdir ("../..");  
say for glob("*");
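
To make the prefix rule concrete, here is a minimal sketch. It assumes a throwaway directory created with File::Temp; the names sub, a.txt, b.txt and .hidden are made up for illustration and do not come from the listing above.

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

# Scratch layout (hypothetical): sub/a.txt, sub/b.txt, sub/.hidden
my $start = getcwd();
my $root  = tempdir( CLEANUP => 1 );
chdir $root or die "Cannot chdir to $root: $!";
mkdir 'sub' or die "Cannot create sub: $!";
for my $name (qw(a.txt b.txt .hidden)) {
    open my $fh, '>', "sub/$name" or die "Cannot create sub/$name: $!";
    close $fh;
}

# Pattern with a directory part: each result keeps that directory part.
my @with_prefix = glob 'sub/*';

# Bare pattern after chdir: bare names, and "*" skips names starting with ".".
chdir 'sub' or die "Cannot chdir to sub: $!";
my @bare = glob '*';

# Leave the scratch directory so File::Temp can remove it at exit.
chdir $start or die "Cannot chdir back to $start: $!";

print "with prefix: @with_prefix\n";
print "bare:        @bare\n";
```

Under these assumptions the first list should be sub/a.txt and sub/b.txt, and the second just a.txt and b.txt; .hidden appears in neither.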


栀梦, 2020-12-01 03:02

That was a pretty comprehensive list. readdir (and readdir + grep) has less overhead than glob, which is a plus for readdir if you need to scan a very large number of directories.
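
For reference, the readdir + grep combination can be sketched as follows. list_matching is a hypothetical helper name, and the scratch directory and file names are assumptions made only so the example runs on its own.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Return the sorted names in $dir that match $regex, skipping . and ..
# (list_matching is an illustrative helper, not part of any library).
sub list_matching {
    my ( $dir, $regex ) = @_;
    opendir my $dh, $dir or die "Cannot open $dir: $!";
    my @names = grep { $_ ne '.' && $_ ne '..' && /$regex/ } readdir $dh;
    closedir $dh;
    my @sorted = sort @names;
    return @sorted;
}

# Scratch directory with a few made-up files.
my $dir = tempdir( CLEANUP => 1 );
for my $name (qw(data.txt data1.txt 1perl.pl)) {
    open my $fh, '>', "$dir/$name" or die "Cannot create $dir/$name: $!";
    close $fh;
}

my @txt = list_matching( $dir, qr/\.txt\z/ );
print "$_\n" for @txt;
```

Unlike glob, the names come back without the directory part, so prepend it yourself before opening them.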

面向向阳花, 2020-12-01 03:03

You missed the most important, biggest difference between them: glob gives you back a list, but opendir gives you a directory handle. You can pass that directory handle around to let other objects or subroutines use it. With the directory handle, the subroutine or object doesn't have to know anything about where it came from, who else is using it, and so on:

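sub use_any_dir_handle {
    my( $dh ) = @_;
    rewinddir $dh;    # start from the first entry, whoever used the handle before
    # ...do some filtering...; skipping . and .. here is only an example
    my @files = grep { $_ ne '.' and $_ ne '..' } readdir $dh;
    return \@files;
}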


With the dirhandle, you have a controllable iterator: you can move around in it with seekdir (using positions returned by telldir), whereas with glob you just get the next item.
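
A minimal sketch of moving around in a directory handle, assuming a scratch directory with made-up file names. It only checks that seeking back to a telldir position yields the same entry again; the order of entries depends on the filesystem.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch directory with three made-up files (plus . and ..).
my $dir = tempdir( CLEANUP => 1 );
for my $name (qw(one two three)) {
    open my $fh, '>', "$dir/$name" or die "Cannot create $dir/$name: $!";
    close $fh;
}

opendir my $dh, $dir or die "Cannot open $dir: $!";

my $skipped  = readdir $dh;    # read one entry in scalar context
my $position = telldir $dh;    # remember the current position
my $first    = readdir $dh;    # the entry right after that position
seekdir $dh, $position;        # jump back to the remembered position
my $again    = readdir $dh;    # read the entry after that position again

closedir $dh;

if ( defined $first && defined $again && $first eq $again ) {
    print "seekdir returned to the same entry\n";
}
```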

As with anything though, the costs and benefits only make sense when applied to a certain context. They do not exist outside of a particular use. You have an excellent list of their differences, but I wouldn't classify those differences without knowing what you were trying to do with them.

Some other things to remember:


You can implement your own glob with opendir, but not the other way around.
glob uses its own wildcard syntax, and that's all you get. 
glob can return filenames that don't exist:

$ perl -le 'print glob "{ab}{cd}"'
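
The same effect can be checked in a script. This sketch uses a comma-separated brace pattern, {a,b}{c,d}, which is a different pattern from the one-liner above, and runs it in an empty scratch directory (an assumption made so that none of the names can exist).

```perl
use strict;
use warnings;
use Cwd qw(getcwd);
use File::Temp qw(tempdir);

# Work inside an empty scratch directory so no file can match.
my $start = getcwd();
my $dir   = tempdir( CLEANUP => 1 );
chdir $dir or die "Cannot chdir to $dir: $!";

# Brace expansion without wildcards: names are generated, not looked up.
my @names   = glob '{a,b}{c,d}';
my @missing = grep { !-e $_ } @names;

chdir $start or die "Cannot chdir back to $start: $!";

print "generated: @names\n";
print "missing on disk: ", scalar(@missing), "\n";
```

All four generated names (ac, ad, bc, bd) are reported even though the directory is empty, so test with -e before relying on the results of a brace pattern.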

盖世英雄少女心, 2020-12-01 03:05

On a similar note, File::Slurp has a function called read_dir. 

Since I use File::Slurp's other functions a lot in my scripts, read_dir has also become a habit. 

It also has the following options: err_mode, prefix, and keep_dot_dot.

野趣味, 2020-12-01 03:09

Here is a disadvantage for opendir and readdir.

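{
  # create a file literally named "0" in the current directory
  open my $file, '>', 0 or die "Cannot create file 0: $!";
  print {$file} 'Breaks while( readdir ){ ... }';
}
opendir my $dir, '.' or die "Cannot open current directory: $!";

# for calls readdir in list context, so every entry is counted
my $a = 0;
++$a for readdir $dir;
print $a, "\n";

rewinddir $dir;

# on the Perl versions tested here, while checks each name for truth
my $b = 0;
++$b while readdir $dir;
print $b, "\n";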


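You would expect that code to print the same number twice, but it doesn't, because there is a file named 0: the while loop tests each returned name for truth, and the string "0" is false, so the loop stops as soon as it reaches that file. On my computer it prints 251 and 188, tested with Perl v5.10.0 and v5.10.1.

The same problem means that the following just prints a bunch of empty lines, regardless of whether file 0 exists, because on those versions the result of readdir in a while condition is not assigned to $_: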

use 5.10.0;
opendir my $dir, '.';

say while readdir $dir;




Whereas this always works just fine:

use 5.10.0;
my $a = 0;
++$a for glob '*';
say $a;

my $b = 0;
++$b while glob '*';
say $b;

say for glob '*';
say while glob '*';




I fixed these issues, and sent in a patch which made it into Perl v5.11.2, so this will work properly with Perl v5.12.0 when it comes out.

My fix converts this:

while( readdir $dir ){ ... }


into this:

while( defined( $_ = readdir $dir ) ){ ... }


This makes it work the same way that read has worked on files. In fact, it is the same bit of code; I just added another element to the corresponding if statements.
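
On any Perl version, writing the defined test yourself avoids the problem. A minimal sketch, assuming a scratch directory that contains a file named 0 (the other file names are made up):

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch directory containing a file literally named "0".
my $dir = tempdir( CLEANUP => 1 );
for my $name (qw(0 a b)) {
    open my $fh, '>', "$dir/$name" or die "Cannot create $dir/$name: $!";
    close $fh;
}

opendir my $dh, $dir or die "Cannot open $dir: $!";

# List context: counts every entry, including . and ..
my $count_for = 0;
$count_for++ for readdir $dh;

rewinddir $dh;

# Scalar context with an explicit defined test: the name "0" no longer
# ends the loop, only the undef returned at the end of the directory does.
my $count_while = 0;
while ( defined( my $entry = readdir $dh ) ) {
    $count_while++;
}

closedir $dh;

print "for: $count_for, while: $count_while\n";
```

Both counters should agree, whatever order the filesystem returns the entries in.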

[愿得一人], 2020-12-01 03:18

Well, you pretty much cover it.  All that taken into account, I would tend to use glob when I'm throwing together a quick one-off script and its behavior is just what I want, and use opendir and readdir in ongoing production code or libraries where I can take my time and clearer, cleaner code is helpful.