How can one check to see if a remote file exists using PHP?

死守一世寂寞 2020-11-22 05:52

The best approach I could find, an if/fclose/fopen type of check, makes the page load really slowly.

Basically what I'm trying to do is

22 answers
  • 2020-11-22 06:18

    As Pies says, you can use cURL. You can have cURL fetch only the headers and not the body, which may make the check faster. A bad domain can still take a while, because you will be waiting for the request to time out; you can shorten that wait with cURL's timeout options (CURLOPT_CONNECTTIMEOUT and CURLOPT_TIMEOUT).

    Here is an example:

    function remoteFileExists($url) {
        $curl = curl_init($url);
    
        //don't fetch the actual page, you only want to check the connection is ok
        curl_setopt($curl, CURLOPT_NOBODY, true);
    
        //do request
        $result = curl_exec($curl);
    
        $ret = false;
    
        //if request did not fail
        if ($result !== false) {
            //if request was ok, check response code
            $statusCode = curl_getinfo($curl, CURLINFO_HTTP_CODE);  
    
            if ($statusCode == 200) {
                $ret = true;   
            }
        }
    
        curl_close($curl);
    
        return $ret;
    }
    
    $exists = remoteFileExists('http://stackoverflow.com/favicon.ico');
    if ($exists) {
        echo 'file exists';
    } else {
        echo 'file does not exist';   
    }
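
    The timeout idea mentioned above can be sketched as follows. This is a minimal variant of the function, not a definitive implementation; the function name and the 3-second and 5-second defaults are assumptions chosen for illustration.

    ```php
    <?php
    // Sketch: header-only existence check with explicit timeouts.
    // The name and the default timeout values are illustrative assumptions.
    function remoteFileExistsWithTimeout(string $url, int $connectTimeout = 3, int $totalTimeout = 5): bool
    {
        $curl = curl_init($url);
        if ($curl === false) {
            return false; // handle could not be created
        }

        curl_setopt_array($curl, [
            CURLOPT_NOBODY         => true,            // headers only, no body
            CURLOPT_CONNECTTIMEOUT => $connectTimeout, // seconds allowed for connecting
            CURLOPT_TIMEOUT        => $totalTimeout,   // seconds allowed for the whole request
        ]);

        // curl_exec() returns false on failure, including a timeout.
        if (curl_exec($curl) === false) {
            return false;
        }

        // The handle is released when $curl goes out of scope.
        return curl_getinfo($curl, CURLINFO_HTTP_CODE) === 200;
    }
    ```

    With these settings a dead host costs at most a few seconds per check instead of the system default.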
    
  • 2020-11-22 06:19

    This is not an answer to your original question, but a better way of doing what you're trying to do:

    Instead of trying to fetch the site's favicon directly (which is a royal pain, given that it could be /favicon.png, /favicon.ico, /favicon.gif, or even /path/to/favicon.png), use Google:

    <img src="http://www.google.com/s2/favicons?domain=[domain]">
    

    Done.
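
    If the tag is generated from PHP, the [domain] placeholder has to be filled in per site. A small sketch, assuming a hypothetical helper named googleFaviconUrl that accepts either a full URL or a bare host name:

    ```php
    <?php
    // Hypothetical helper: build the favicon service URL shown above
    // from a page URL or a bare host name.
    function googleFaviconUrl(string $urlOrHost): string
    {
        // parse_url() only reports a host when the input has a scheme.
        $host = parse_url($urlOrHost, PHP_URL_HOST);
        if (!is_string($host) || $host === '') {
            $host = $urlOrHost; // assume the input already is a host name
        }

        return 'http://www.google.com/s2/favicons?domain=' . rawurlencode($host);
    }

    // Usage: escape the URL before placing it in an HTML attribute.
    echo '<img src="' . htmlspecialchars(googleFaviconUrl('https://stackoverflow.com/questions')) . '">';
    ```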

  • 2020-11-22 06:22

    If the file is not hosted externally (that is, it lives on your own server), you can translate the URL to an absolute path on your web server. That way you don't have to call cURL, file_get_contents, etc.

    function remoteFileExists($url) {
    
        $root = realpath($_SERVER["DOCUMENT_ROOT"]);
        $urlParts = parse_url( $url );
    
        if ( !isset( $urlParts['path'] ) )
            return false;
    
        if ( is_file( $root . $urlParts['path'] ) )
            return true;
        else
            return false;
    
    }
    
    remoteFileExists( 'https://www.yourdomain.com/path/to/remote/image.png' );
    

    Note: Your web server must populate DOCUMENT_ROOT for this function to work.

  • 2020-11-22 06:24

    If you are dealing with images, use getimagesize. Unlike file_exists, this built-in function supports remote URLs (provided allow_url_fopen is enabled). On success it returns an array of image information (width, height, type, and so on); on failure it returns false. All you have to do is check that the result is not false. Use print_r to output the contents of the array.

    // @ suppresses the warning PHP emits when the URL is unreachable or not an image
    $imageArray = @getimagesize("http://www.example.com/image.jpg");
    if ($imageArray !== false)
    {
        echo "it's an image and here is the image's info<br>";
        print_r($imageArray);
    }
    else
    {
        echo "invalid image";
    }
    
  • 2020-11-22 06:25

    You can instruct curl to use the HTTP HEAD method via CURLOPT_NOBODY.

    More or less:

    $ch = curl_init("http://www.example.com/favicon.ico");
    
    curl_setopt($ch, CURLOPT_NOBODY, true);
    curl_exec($ch);
    $retcode = curl_getinfo($ch, CURLINFO_HTTP_CODE);
    // $retcode >= 400 -> not found; $retcode == 200 -> found.
    curl_close($ch);
    

    Either way, you only save the cost of transferring the body, not the cost of establishing and closing the TCP connection. And since favicons are small, you might not see much improvement.

    Caching the result locally seems like a good idea if this turns out to be too slow. The response to a HEAD request usually includes the file's modification time in its headers (Last-Modified). You can do as browsers do and read it via CURLINFO_FILETIME (set CURLOPT_FILETIME to true first). In your cache, store URL => [ favicon, timestamp ]; then compare the timestamps and re-download the favicon only when it has changed.
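
    The caching idea can be sketched roughly like this. The function names, the array layout of the cache, and the policy of keeping the cached copy when the server reports no modification time are all assumptions made for illustration.

    ```php
    <?php
    // Ask the server for the resource's modification time with a header-only request.
    // Returns a Unix timestamp, or -1 when the request fails or no time is reported.
    function remoteFileTime(string $url): int
    {
        $ch = curl_init($url);
        if ($ch === false) {
            return -1;
        }

        curl_setopt_array($ch, [
            CURLOPT_NOBODY   => true,
            CURLOPT_FILETIME => true, // needed for CURLINFO_FILETIME to be filled in
        ]);

        if (curl_exec($ch) === false) {
            return -1;
        }

        return (int) curl_getinfo($ch, CURLINFO_FILETIME);
    }

    // Hypothetical cache layout: [url => ['favicon' => bytes, 'timestamp' => int]].
    function needsRefresh(array $cache, string $url, int $remoteTime): bool
    {
        if (!isset($cache[$url])) {
            return true;  // never fetched before
        }
        if ($remoteTime < 0) {
            return false; // no time reported; keep the cached copy (assumed policy)
        }

        return $remoteTime > $cache[$url]['timestamp'];
    }
    ```

    A caller would run needsRefresh($cache, $url, remoteFileTime($url)) and download the icon only when it returns true.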

  • 2020-11-22 06:26

    You should issue HEAD requests, not GET requests, because you don't need the contents of the URI at all. As Pies said above, you should check the status code (anything in the 200-299 range counts as success, and you may optionally follow 3xx redirects).

    The answers to this question contain many code examples that may be helpful: PHP / Curl: HEAD Request takes a long time on some sites
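
    A minimal sketch of the status check described in this answer, assuming hypothetical function names and an arbitrary limit of 5 redirects:

    ```php
    <?php
    // True for any status code in the 200-299 range.
    function isSuccessStatus(int $code): bool
    {
        return $code >= 200 && $code < 300;
    }

    // Header-only request that follows redirects and checks the final status code.
    // The function name and the default redirect limit are illustrative assumptions.
    function headRequestSucceeds(string $url, int $maxRedirects = 5): bool
    {
        $ch = curl_init($url);
        if ($ch === false) {
            return false;
        }

        curl_setopt_array($ch, [
            CURLOPT_NOBODY         => true,          // HEAD instead of GET
            CURLOPT_FOLLOWLOCATION => true,          // follow 3xx redirects
            CURLOPT_MAXREDIRS      => $maxRedirects, // stop after this many hops
        ]);

        // Fails on network errors and when the redirect limit is exceeded.
        if (curl_exec($ch) === false) {
            return false;
        }

        // With redirects followed, this is the status of the last response.
        return isSuccessStatus((int) curl_getinfo($ch, CURLINFO_HTTP_CODE));
    }
    ```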
