I wonder if there are any good PHP scripts (libraries) to check whether links are broken? I have links to documents in a MySQL table and could possibly just check if the link leads
You can do this in a few ways:
First way - curl
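function url_exists($url) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_HEADER, true);          // include response headers in the output
    curl_setopt($ch, CURLOPT_NOBODY, true);          // send a HEAD request, skip the body
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false); // do not follow redirects
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);  // return the output instead of printing it
    $response = curl_exec($ch);
    curl_close($ch);
    $status = array();
    // Guard against a failed request (false) or a missing status line.
    if ($response === false || !preg_match('/HTTP\/\S+ ([0-9]+)/', $response, $status)) {
        return false;
    }
    return ($status[1] == 200);
}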
Second way - if you don't have curl installed - get_headers
function url_exists($url) {
    $h = @get_headers($url); // returns false on failure
    if ($h === false) {
        return false;
    }
    $status = array();
    preg_match('/HTTP\/\S+ ([0-9]+)/', $h[0], $status);
    return (isset($status[1]) && $status[1] == 200);
}
Third way - fopen
function url_exists($url) {
    $handle = @fopen($url, 'r'); // requires allow_url_fopen to be enabled
    if ($handle !== false) {
        fclose($handle);
        return true;
    }
    return false;
}
First & second solutions
Try this:
$url = '[your_url]';
$curl = curl_init();
curl_setopt($curl, CURLOPT_URL, $url);
curl_setopt($curl, CURLOPT_FOLLOWLOCATION, true);
curl_setopt($curl, CURLOPT_RETURNTRANSFER, true);
$result = curl_exec($curl);
// Note: curl_exec() returns false only on transport errors (DNS failure, timeout, ...).
// An HTTP 404 or 500 response still returns a body, so it is not reported here.
if ($result === false) {
    echo 'broken url';
} else {
    $newUrl = curl_getinfo($curl, CURLINFO_EFFECTIVE_URL);
    if ($newUrl !== $url) {
        echo 'redirect to: ' . $newUrl;
    }
}
curl_close($curl);
As a quick workaround, you can use the $http_response_header variable, which PHP populates in the local scope after a file_get_contents() call over HTTP.
For example (taken from the PHP documentation):
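<?php
function get_contents() {
    file_get_contents("http://example.com");
    var_dump($http_response_header); // populated here, in the function's local scope
}
get_contents();
var_dump($http_response_header); // not set here, because no request was made in this scope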
Then check the first line for the status, for example "HTTP/1.1 200 OK", or for another HTTP status code.
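To turn that first line into a number, a small helper can pull the status code out of the header array. This is only a minimal sketch: parse_status_code is a hypothetical name, not a PHP built-in, and it assumes the array has the same shape as $http_response_header (status lines mixed with "Name: value" lines). When redirects are followed there can be several status lines, so the sketch keeps the last one.

```php
<?php
// Hypothetical helper: extract the final HTTP status code from an array
// shaped like $http_response_header. Returns null if no status line is found.
function parse_status_code(array $headers): ?int
{
    $code = null;
    foreach ($headers as $line) {
        // Status lines look like "HTTP/1.1 200 OK"; after redirects there are several.
        if (preg_match('#^HTTP/\S+\s+(\d{3})\b#i', $line, $m)) {
            $code = (int) $m[1]; // keep overwriting so the last response wins
        }
    }
    return $code;
}

// Example with a hand-written header list (no network access needed).
$sample = ['HTTP/1.1 301 Moved Permanently', 'Location: /new', 'HTTP/1.1 200 OK'];
var_dump(parse_status_code($sample)); // int(200)
```

In real use you would pass $http_response_header right after the file_get_contents() call, in the same scope.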
You can check for broken links using this function:
function check_url($url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, 1);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    $data = curl_exec($ch);
    $headers = curl_getinfo($ch); // array of transfer info, including 'http_code'
    curl_close($ch);
    return $headers['http_code'];
}
You need to have cURL installed for this to work. Now you can check for broken links using:
$check_url_status = check_url($url);
if ($check_url_status == '200')
    echo "Link Works";
else
    echo "Broken Link";
Also check this link for HTTP status codes: HTTP Status Codes
I think you can also check for 301 and 302 status codes.
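One way to treat redirects separately from real failures is to map the numeric code to a coarse state. The sketch below is only an illustration: classify_status is a hypothetical name, and the choice to count 303, 307 and 308 as redirects alongside 301 and 302, and everything outside 2xx and those codes as broken, is an assumption you may want to adjust.

```php
<?php
// Hypothetical helper: map an HTTP status code to a coarse link state.
function classify_status(int $code): string
{
    if ($code >= 200 && $code < 300) {
        return 'ok';
    }
    if (in_array($code, [301, 302, 303, 307, 308], true)) {
        return 'redirect';
    }
    return 'broken'; // includes 0 (no response), 4xx and 5xx
}

echo classify_status(200), "\n"; // ok
echo classify_status(302), "\n"; // redirect
echo classify_status(404), "\n"; // broken
```

The integer returned by a cURL-based check such as curl_getinfo()'s 'http_code' entry can be passed straight in; cURL reports 0 there when no response was received.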
Another method would be to use the get_headers function, but this works only with PHP 5 or later:
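function check_url($url) {
    $headers = @get_headers($url);
    if (!is_array($headers)) {
        return false; // request failed, no headers to inspect
    }
    $headers = implode("\n", $headers);
    // Match the status line; (200|301|302) is a group of alternatives, not a character class.
    return (bool) preg_match('#^HTTP/\S+\s+(200|301|302)\s#i', $headers);
}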
In this case just check the output:
if (check_url($url))
    echo "Link Works";
else
    echo "Broken Link";
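Since the links in the question live in a MySQL table, the same boolean check can be run over every row. The following is a rough sketch under stated assumptions: find_broken_links is a hypothetical name, each row is assumed to be an array with 'id' and 'url' keys (the real column names are not given), and the checker is passed in as a callable so that any function returning true for a working link can be plugged in.

```php
<?php
// Hypothetical sketch: $rows stands in for rows fetched from the MySQL table,
// each with assumed 'id' and 'url' keys. $checker is any callable that takes a
// URL and returns true when the link works.
function find_broken_links(iterable $rows, callable $checker): array
{
    $broken = [];
    foreach ($rows as $row) {
        if (!$checker($row['url'])) {
            $broken[] = $row['id']; // remember which rows need attention
        }
    }
    return $broken;
}

// Usage with a stub checker so the example runs without network access.
$rows = [
    ['id' => 1, 'url' => 'http://example.com/a.pdf'],
    ['id' => 2, 'url' => 'http://example.com/missing.pdf'],
];
$stub = function (string $url): bool {
    return strpos($url, 'missing') === false;
};
print_r(find_broken_links($rows, $stub)); // lists id 2
```

With real data you would pass the name of a boolean checker, for example 'check_url', instead of the stub.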
Hope this helps you :).