So I am using WWW::Mechanize to crawl sites. It works great, except if I request a URL such as:
http://www.levi.com/
I am redirected to:
http://us.levi.com/home/index.jsp
For my script I need to know that this redirect took place and which URL I was redirected to. Is there any way to detect this with WWW::Mechanize or LWP and then get the redirected URL? Thanks!
use strict;
use warnings;
use URI;
use WWW::Mechanize;

my $url  = 'http://...';
my $mech = WWW::Mechanize->new(autocheck => 0);

# Do not follow redirects automatically, so the 3xx response stays visible.
$mech->max_redirect(0);
$mech->get($url);

my $status = $mech->status();
if (($status >= 300) && ($status < 400)) {
    my $location = $mech->response()->header('Location');
    if (defined $location) {
        print "Redirected to $location\n";
        # Location may be relative, so resolve it against the current base.
        $mech->get(URI->new_abs($location, $mech->base()));
    }
}
If the status code is 3xx, check the Location header of the response for the redirect URL.
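The URI->new_abs call in the code above matters because a Location header is allowed to be relative. A minimal sketch of what it does, assuming a made-up relative Location value (the real redirect in the question is an absolute URL on another host):

```perl
use strict;
use warnings;
use feature qw( say );
use URI;

# Base URL of the page that issued the redirect (taken from the question).
my $base = 'http://www.levi.com/';

# Hypothetical relative Location value: resolved against the base.
my $relative = URI->new_abs('/home/index.jsp', $base);

# Absolute Location value (the one from the question): returned unchanged.
my $absolute = URI->new_abs('http://us.levi.com/home/index.jsp', $base);

say $relative;    # http://www.levi.com/home/index.jsp
say $absolute;    # http://us.levi.com/home/index.jsp
```

Resolving every Location value this way means the follow-up get() receives an absolute URL in both cases.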
You can also get to the same place by calling the redirects() method on the response object.
use strict;
use warnings;
use feature qw( say );
use WWW::Mechanize;
my $ua = WWW::Mechanize->new;
my $res = $ua->get('http://metacpan.org');
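# redirects() lists the intermediate responses, oldest first.
# The list is empty if no redirect took place.
my @redirects = $res->redirects;
if (@redirects) {
    say 'request uri: ' . $redirects[-1]->request->uri;
    say 'location header: ' . $redirects[-1]->header('Location');
}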
Prints:
request uri: http://metacpan.org
location header: https://metacpan.org/
See https://metacpan.org/pod/HTTP::Response#$r-%3Eredirects for details. Keep in mind that more than one redirect may have taken you to your current location, so you may want to inspect every response returned by redirects().
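Walking the whole chain could look roughly like the sketch below. To keep it runnable without network access, it builds a fake two-hop chain by hand: the fake_response helper and the example.com URLs are assumptions for illustration only. With a real request you would call redirects() on the response returned by get() instead.

```perl
use strict;
use warnings;
use feature qw( say );
use HTTP::Request;
use HTTP::Response;

# Hypothetical helper: build a response by hand so no network is needed.
sub fake_response {
    my ($code, $url, $location) = @_;
    my $res = HTTP::Response->new($code);
    $res->request(HTTP::Request->new(GET => $url));
    $res->header(Location => $location) if defined $location;
    return $res;
}

my $hop1  = fake_response(301, 'http://example.com/',     'http://www.example.com/');
my $hop2  = fake_response(302, 'http://www.example.com/', 'http://www.example.com/home');
my $final = fake_response(200, 'http://www.example.com/home');

# LWP links each response to the one that preceded it via previous().
$hop2->previous($hop1);
$final->previous($hop2);

# redirects() returns the intermediate responses, oldest first.
my @chain = map {
    sprintf '%d %s -> %s', $_->code, $_->request->uri, $_->header('Location')
} $final->redirects;

say for @chain;
say 'final url: ' . $final->request->uri;
```

Each entry in @chain describes one hop, and the request URI of the final response is the URL you ended up at.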
Source: https://stackoverflow.com/questions/10922054/perl-wwwmechanize-or-lwp-get-redirect-url