How do I download a file with WWW::Mechanize after it submits a form?

后端未结

关注

 3  905

I have the code:

#!/usr/bin/perl
use strict;
use WWW::Mechanize;

my $url = \'http://divxsubtitles.net/page_subtitleinformation.php?ID=111292\';
my $m = WWW:


                      
              相关标签:


      
      
        
          3条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  北海茫月        
                
              
                            
                2021-01-19 04:59
              
            
            
                                                                       
After submitting the form, you can use:


  $mech->save_content( $filename )
  
  Dumps the contents of $mech->content into $filename. $filename will be
  overwritten. Dies if there are any errors.
  
  If the content type does not begin with "text/", then the content is
  saved in binary mode.


Source: http://metacpan.org/pod/WWW::Mechanize
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  温柔的废话        
                
              
                            
                2021-01-19 05:18
              
            
            
                                                                       
I tried your code and it returns a stack of HTML of which the only http:// references were:
    http://www.w3c.org
    http://ad.z5x.net
    http://divxsubtitles.net
    http://feeds2read.net
    http://ad.z5x.net
    http://www.google-analytics.com
    http://cls.assoc-amazon.com

using the code


    my $content = $m->response->content();
    while ( $content =~ m{(http://[^/\" \t\n\r]+)}g ) {
        print( "$1\n" );
    }


So my comments to you are:

1. add use strict; to your code, you are programming for failure if you don't

2. read the output HTML and determine what to do next, you haven't done that, and therefore you've asked an incomplete question. Unless you identify the URL you want to download you are asking somebody else to write a program for you.

Once you've identified the URL you want to download it is a simple matter of getting it and then writing the response content to a file. e.g.


if ( ! open( FOUT, ">output.bin" ) ) {
    die( "Could not create file: $!" );
}
binmode( FOUT ); # required for Windows
print( FOUT $m->response->content() );
close( FOUT );

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  南笙        
                
              
                            
                2021-01-19 05:20
              
            
            
                                                                       
Well the thing that threw me off the most was the "mechanize->form_number" subroutine starts at 1 whereas typical programs start their index at 0. If anyone wants to know how to download response headers, or download header attachments, this is the way to do it.

Now here's the full code to do what I wanted.

#!/usr/bin/perl
use strict;
use WWW::Mechanize;

my $url = 'http://divxsubtitles.net/page_subtitleinformation.php?ID=111292';
my $m = WWW::Mechanize->new(autocheck => 1);
$m->get($url);
$m->form_number(2);
$m->click();
my $response = $m->res();
my $filename = $response->filename;

if (! open ( FOUT, ">$filename" ) ) {
    die("Could not create file: $!" );
}
print( FOUT $m->response->content() );
close( FOUT );

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
                             
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复