I want to get a requested website's favicon with PHP. I have been advised to use Google's favicon service, but it is not functional. I want to do something on my own but
I've been doing something similar. I checked this with a bunch of URLs and they all seemed to work. The URL doesn't have to be a base URL:
function getFavicon($url){
    # make the URL simpler: keep only scheme and host
    $elems = parse_url($url);
    if (!isset($elems['scheme'], $elems['host'])) {
        return false;
    }
    $url = $elems['scheme'].'://'.$elems['host'];
    # load site; file_get_contents() returns false on failure
    $output = @file_get_contents($url);
    if ($output === false) {
        return false;
    }
    # look for the (shortcut) icon inside the loaded page;
    # the href is required and must come after rel in the same tag
    $regex_pattern = "/rel=[\'\"](?:shortcut )?icon[\'\"][^>]*?href=[\'\"]([^\'\"]+)[\'\"]/i";
    if (preg_match($regex_pattern, $output, $matches)) {
        $favicon = $matches[1];
        # check if absolute url or relative path
        $favicon_elems = parse_url($favicon);
        # if relative, append to the site root without doubling the slash
        if (!isset($favicon_elems['host'])) {
            $favicon = $url . '/' . ltrim($favicon, '/');
        }
        return $favicon;
    }
    return false;
}
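The function above joins any relative path onto the site root. If you fetch a page that is not the root, a relative href is really relative to that page. A minimal sketch of that resolution follows; resolveIconUrl is a hypothetical helper name, and it assumes http(s)-style URLs and does not normalise "../" segments or handle data: URIs.

```php
<?php
// Hypothetical helper (not part of the answer above): turn an icon href
// found in a page into an absolute URL, given the URL of that page.
function resolveIconUrl(string $pageUrl, string $href): ?string
{
    $base = parse_url($pageUrl);
    if ($base === false || !isset($base['scheme'], $base['host'])) {
        return null; // the page URL itself must be absolute
    }
    $origin = $base['scheme'] . '://' . $base['host']
        . (isset($base['port']) ? ':' . $base['port'] : '');

    if (preg_match('#^[a-z][a-z0-9+.-]*://#i', $href)) {
        return $href;                          // already absolute
    }
    if (strncmp($href, '//', 2) === 0) {
        return $base['scheme'] . ':' . $href;  // protocol-relative
    }
    if (strncmp($href, '/', 1) === 0) {
        return $origin . $href;                // root-relative
    }
    // Document-relative: resolve against the directory of the page path.
    $path = $base['path'] ?? '/';
    $dir = substr($path, 0, strrpos($path, '/') + 1);
    return $origin . $dir . $href;
}
```

For example, resolveIconUrl('http://example.com/blog/post.html', 'favicon.ico') gives http://example.com/blog/favicon.ico, while '/favicon.ico' resolves to the site root.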
If you want to retrieve the favicon from a particular website, you simply need to fetch favicon.ico from the root of that website. Like so:
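$domain = "www.example.com";
$url = "http://".$domain."/favicon.ico";
# file_get_contents() returns false if the request fails (e.g. no /favicon.ico)
$icondata = file_get_contents($url);
if ($icondata !== false) {
    # ... you can now do what you like with the icon data
}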
According to Wikipedia, there are two main methods websites can use to have a favicon picked up by a browser. The first is, as Steve mentioned, storing the icon as favicon.ico in the root directory of the web server. The second is to reference the favicon via the HTML link tag.
To cover all of these cases, the best idea would be to test for the presence of the favicon.ico file first and, if it is not present, search for either the <link rel="icon" or the <link rel="shortcut icon" part in the source (limited to the HTML head node) until you find the favicon. It is up to you whether you use regex or some other string search option (not to mention the built-in PHP ones). Finally, this question may be of some help to you.
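As one possible take on the "search the head node" step, here is a rough sketch that uses PHP's built-in DOMDocument instead of regex. The name findIconHref is made up for illustration, and it assumes the dom extension is enabled and that you have already fetched the page source into a string.

```php
<?php
// Hypothetical helper: returns the href of the first <link rel="icon"> or
// <link rel="shortcut icon"> inside <head>, or null if none is found.
function findIconHref(string $html): ?string
{
    if ($html === '') {
        return null; // loadHTML() rejects an empty string
    }
    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);
    $loaded = $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    if (!$loaded) {
        return null;
    }
    $head = $doc->getElementsByTagName('head')->item(0);
    if ($head === null) {
        return null;
    }
    foreach ($head->getElementsByTagName('link') as $link) {
        // rel is a space-separated, case-insensitive list of tokens
        $rel = preg_split('/\s+/', strtolower(trim($link->getAttribute('rel'))));
        if (in_array('icon', $rel, true) && $link->getAttribute('href') !== '') {
            return $link->getAttribute('href');
        }
    }
    return null;
}
```

You would call this only after the request for /favicon.ico has failed, and the returned href may still be relative, so it needs to be resolved against the page URL before fetching.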
It looks like http://www.getfavicon.org/?url=domain.com (FAQ) reliably scrapes a website's favicon. I realise it's a 3rd-party service but I think it's a worthy alternative to the Google favicon service.
Use the S2 service provided by Google. It is as simple as this:
http://www.google.com/s2/favicons?domain=www.yourdomain.com
Scraping this would be much easier than trying to do it yourself.
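If you go this route, a small sketch for building that URL from whatever the user typed might look like the following. googleFaviconUrl is a hypothetical helper name; the only thing taken from the service itself is the URL format quoted above, and nothing here checks that the service actually responds.

```php
<?php
// Hypothetical helper: build the S2 favicon URL for a page URL or a bare host.
function googleFaviconUrl(string $urlOrHost): string
{
    $host = parse_url($urlOrHost, PHP_URL_HOST);
    if (!is_string($host) || $host === '') {
        $host = $urlOrHost; // assumption: a bare host name was passed
    }
    // http_build_query() takes care of URL-encoding the value.
    return 'http://www.google.com/s2/favicons?' . http_build_query(['domain' => $host]);
}
```

You can then pass the result to file_get_contents() in the same way as any other icon URL.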
Found this thread... I have written a WordPress plugin that encompasses a lot of variations on retrieving the favicon. Since there are a lot, here is the GPL code: http://plugins.svn.wordpress.org/wp-favicons/trunk/
It lets you run a server that clients can request icons from via XML-RPC requests, so any client can request icons. It has a plugin structure, so you can try Google, getfavicon, etc. to see if one of these services delivers anything. If not, it goes into an icon-fetching mode that takes into account all HTTP statuses (301/302/404) and does its best to find an icon anywhere. After this it uses image library functions to check inside the file whether it really is an image and what kind of image it is (sometimes the extension is wrong). It is pluggable, so you can add image conversions or extra functionality afterwards in the pipeline.
The HTTP fetching file does some logic similar to what I see above: http://plugins.svn.wordpress.org/wp-favicons/trunk/includes/server/class-http.php
but it is only part of the pipeline.
It can get pretty complex once you dive into it.
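To give an idea of the "is it really an image" check, here is a minimal sketch that looks at the first bytes of the downloaded data instead of trusting the extension. This is not the plugin's code: sniffIconType is a hypothetical name, and the list only covers the well-known ICO, PNG, GIF and JPEG signatures.

```php
<?php
// Hypothetical helper: guess the image type from the leading bytes of the
// data, ignoring the file extension. Returns null if no signature matches.
function sniffIconType(string $data): ?string
{
    $signatures = [
        'ico'  => "\x00\x00\x01\x00",   // ICO header: reserved = 0, type = 1
        'png'  => "\x89PNG\r\n\x1a\n",  // 8-byte PNG signature
        'gif'  => 'GIF8',               // covers GIF87a and GIF89a
        'jpeg' => "\xFF\xD8\xFF",       // JPEG start-of-image marker
    ];
    foreach ($signatures as $type => $magic) {
        // strncmp() is binary safe, so NUL bytes in the data are fine.
        if (strncmp($data, $magic, strlen($magic)) === 0) {
            return $type;
        }
    }
    return null;
}
```

A server that answers a missing favicon.ico with an HTML error page is caught this way, because the data starts with markup and the function returns null.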