Perl WWW::Mechanize::Firefox timeout implementation

Submitted by 萝らか妹 on 2019-12-08 03:58:08

Question


I am using WWW::Mechanize::Firefox along with the MozRepl plugin in Firefox. The code correctly fetches content from sites by sending them an HTTP GET request.

I am going through a list of URLs and sending an HTTP GET Request to each of them.

However, if the request hangs on a particular URL, it keeps waiting.

Please note that I am referring to cases where part of the web page content has loaded while some of it is still pending. This happens when a web page loads a lot of content from third-party sites: if one of the resources (an image, for instance) cannot be loaded, the browser keeps waiting for it.

I want the request to time out after 'n' seconds so that I can read the next URL from the list and continue with the code execution.

In the WWW::Mechanize Perl module, the constructor supports a timeout option, as shown below:

$mech=WWW::Mechanize->new(timeout => 10);

However, I could not find a similar option in the documentation for the WWW::Mechanize::Firefox Perl module here:

http://metacpan.org/pod/WWW::Mechanize::Firefox

I tried this:

$mech=WWW::Mechanize::Firefox->new(timeout => 10);

But I think it does not work, as there are still some sites for which the request hangs.


Answer 1:


WWW::Mechanize::Firefox uses MozRepl to connect to the Firefox browser, so you don't need to pass a timeout parameter; Firefox itself waits for the page to load.

If you want to check whether the page has really finished loading, check that the element you need (e.g. a div) is present:

# Poll once per second until the target element is visible
while (!$mech->is_visible(xpath => '//div[@class="myDivClassAtHtml"]')) {
    sleep 1;
}
# do something with your page
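
Note that the loop above never gives up if the element does not appear, which is the very situation the question describes. A minimal sketch of a bounded version follows. It assumes a helper named wait_until, which is hypothetical and not part of WWW::Mechanize::Firefox, and it uses a stand-in condition so that it runs without a browser:

```perl
use strict;
use warnings;
use Time::HiRes qw(time sleep);

# Hypothetical helper (not part of WWW::Mechanize::Firefox): call $check
# repeatedly until it returns true or $timeout seconds have elapsed.
# Returns 1 if the condition became true, 0 if the deadline passed first.
sub wait_until {
    my ($check, $timeout, $interval) = @_;
    $interval = 1 unless defined $interval;
    my $deadline = time() + $timeout;
    while (1) {
        return 1 if $check->();
        return 0 if time() >= $deadline;
        sleep $interval;
    }
}

# Stand-in condition for illustration only: becomes true on the third call.
# With a real $mech object, the check from the answer would go here instead:
#   sub { $mech->is_visible(xpath => '//div[@class="myDivClassAtHtml"]') }
my $calls = 0;
my $ok = wait_until(sub { ++$calls >= 3 }, 2, 0.1);
print $ok ? "ready after $calls checks\n" : "timed out\n";
```

Inside the loop over the URL list, a false return value could be used to skip to the next URL (for example with next). This only bounds the wait for the element; it does not by itself interrupt a page load that is still in progress.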


Source: https://stackoverflow.com/questions/22311475/perl-wwwmechanizefirefox-timeout-implementation
