Question
I am using WWW::Mechanize::Firefox along with the MozRepl plugin in Firefox. The code works properly to fetch content from sites by sending them an HTTP GET request.
I am going through a list of URLs and sending an HTTP GET request to each of them.
However, if the request hangs on a particular URL, the script keeps waiting indefinitely.
Please note that I am referring to cases where part of the web page content has loaded while some of the content is still pending. This happens when a web page loads a lot of content from third-party sites: if one of the resources (an image, for instance) cannot be loaded, the browser keeps waiting for it.
I want the request to time out after 'n' seconds so that I can read the next URL from the list and continue with the code execution.
In the WWW::Mechanize Perl module, the constructor supports a timeout option, as shown below:
$mech=WWW::Mechanize->new(timeout => 10);
However, I could not find a similar option in the documentation for the WWW::Mechanize::Firefox Perl module here:
http://metacpan.org/pod/WWW::Mechanize::Firefox
I tried this:
$mech=WWW::Mechanize::Firefox->new(timeout => 10);
But I think it does not work, as there are still some sites for which the request hangs.
Answer 1:
WWW::Mechanize::Firefox uses MozRepl to connect to the Firefox browser, so you don't need to pass a timeout parameter: Firefox itself waits for the page to load.
If you want to check whether the page has really finished loading, check that the element you need (e.g. a div) is present:
while (!$mech->is_visible(xpath => '//div[@class="myDivClassAtHtml"]')) {
    sleep 1;
}
# do something with your page
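If you still need a hard per-URL limit, one common Perl idiom (not something WWW::Mechanize::Firefox provides itself, as far as I know) is to wrap the call in eval with alarm. The sketch below is only that, a sketch: with_timeout is a hypothetical helper name, it assumes a Unix-like system where alarm can interrupt a blocking call, and after a timeout the MozRepl connection may be left in the middle of a reply, so you may have to create a fresh $mech before going on to the next URL.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of WWW::Mechanize::Firefox:
# run $code, but give up after $seconds (whole seconds only).
# Returns 1 if $code finished in time, 0 if it timed out.
# Any other exception thrown by $code is rethrown.
sub with_timeout {
    my ($seconds, $code) = @_;
    my $ok = eval {
        local $SIG{ALRM} = sub { die "timeout\n" };
        alarm $seconds;
        $code->();
        alarm 0;    # finished in time: cancel the pending alarm
        1;
    };
    my $err = $@;
    alarm 0;        # make sure no alarm is left pending in any case
    return 1 if $ok;
    return 0 if $err eq "timeout\n";
    die $err;       # some other failure: pass it on to the caller
}

# Assumed usage with the $mech object and a URL list from the question:
# for my $url (@urls) {
#     my $done = with_timeout(10, sub { $mech->get($url) });
#     warn "Timed out, skipping: $url\n" unless $done;
# }
```

The block passed to with_timeout can also be the is_visible loop above, which otherwise never ends if the element does not appear.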
Source: https://stackoverflow.com/questions/22311475/perl-wwwmechanizefirefox-timeout-implementation