I cant seem to get this peice of code to work:
$self->{_current_page} = $href;
my $response = $ua->get($href);
my $responseCode = $response->
Check your SELinux settings.
SELINUX enabled systems will not allow an outgoing connection from a web agent (httpd).
This page can tell you more about SELinux and HTTPD settings: http://wiki.centos.org/TipsAndTricks/SelinuxBooleans
Enable outbound web connections from Apache in a Perl script:
# setsebool -P httpd_can_network_connect on
Here's your problem:
my $content = LWP::Simple->get($href);
That passes the string "LWP::Simple" as the first argument to 'get'. You want:
my $content = LWP::Simple::get($href);
You should examine the response code to see what's happening (you're already checking for 404s). I get a 302 - a redirect.
For example:
die "get failed ($responseCode): " . $href if (!defined $content);
Resulting message:
get failed (302): http://www.google.com at goog.pl line 20.
A couple of thoughts.
1/ You seems to be using the string comparison operators (le, ne) to compare numbers. You should use the numeric comparison operators (<=, !=) instead.
2/ The value you get back from the LWP::UserAgent::get call is an HTTP::Response object. Judicious use of that class's "is_foo" method might make your code a bit cleaner.
I don't know if either of these will solve your problem. But they'll improve the quality of your code.