I have some domains I want to split but can't figure out the regex...
I have:
http://www.google.com/tomato
http://int.google.c
Why are you trying to use regex? There are plenty of native functions available for this, such as:
$host = parse_url($url, PHP_URL_HOST);
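As a quick illustration of what that call returns, here is a minimal sketch using two URLs like the ones in the question. The variable names are only illustrative.

```php
<?php
// Sample URLs modelled on the question.
$urls = array(
    'http://www.google.com/tomato',
    'http://int.google.com',
);

$hosts = array();
foreach ($urls as $url) {
    // PHP_URL_HOST asks parse_url() for just the host component.
    $hosts[] = parse_url($url, PHP_URL_HOST);
}

print_r($hosts); // www.google.com and int.google.com
```

Note that the scheme matters: given a string such as 'google.com/tomato', parse_url() treats the whole thing as a path and the host comes back as null.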
Update: give this a go. It may need improving, but it's better than regex IMO.
function determainDomainName($url)
{
    $hostname = parse_url($url, PHP_URL_HOST);
    if (!$hostname)
    {
        return null; // No host found (e.g. the URL has no scheme)
    }
    $parts = explode(".", $hostname);
    switch (count($parts))
    {
        case 1:
            return $parts[0]; // Single label, e.g. localhost
        case 2:
            return $parts[0]; // domain.tld, so it has to be a .com etc.
        case 3:
            if ($parts[0] == "www") // The most common subdomain
            {
                return $parts[1]; // Bypass subdomain / return next segment
            }
            if ($parts[1] == "co") // Possible in_array() here for multiples; first segment of a double-barrel TLD
            {
                return $parts[0]; // Bypass double-barrel TLDs
            }
            // No match: fall through to the guess below
        default:
            // Have a guess:
            // I bet the longest word is the domain :)
            /*
            Here we just order the array by the longest word,
            so google will always come above the following:
            com, co, uk, www, cdn, ww1, ww2 etc.
            */
            usort($parts, "mysort");
            return $parts[0];
    }
}
function mysort($a, $b)
{
    return strlen($b) - strlen($a);
}
Add the two functions above to your library.
Then use them like so:
$urls = array(
    'http://www.google.com/tomato',
    'http://int.google.com',
    'http://google.co.uk'
);
foreach ($urls as $url)
{
    echo determainDomainName($url) . "\n";
}
They will all echo google.
See it running at http://codepad.org/pA5KWckb
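One caveat with the longest-word guess: it picks the wrong label whenever a subdomain is longer than the domain itself. Below is a rough sketch of the in_array() idea mentioned in the code comment, which strips the suffix instead of guessing by length. The function name guessDomainLabel and the list of second-level labels are made up for illustration, and the list is nowhere near complete; a robust version would need something like the Public Suffix List.

```php
<?php
// Hypothetical helper, not part of the answer above: strips a known
// suffix instead of guessing by label length.
// $secondLevel is an illustrative, incomplete list of second-level labels.
function guessDomainLabel($url, array $secondLevel = array('co', 'com', 'org', 'net', 'ac', 'gov'))
{
    $hostname = parse_url($url, PHP_URL_HOST);
    if (!$hostname) {
        return null; // No host component (e.g. missing scheme)
    }

    $parts = explode('.', strtolower($hostname));
    if (count($parts) === 1) {
        return $parts[0]; // Single label such as "localhost"
    }

    array_pop($parts); // Drop the TLD ("com", "uk", ...)

    // Drop a second-level label such as the "co" in "co.uk", but only
    // when something remains in front of it.
    if (count($parts) > 1 && in_array(end($parts), $secondLevel, true)) {
        array_pop($parts);
    }

    return end($parts); // The label directly before the suffix
}

$urls = array(
    'http://www.google.com/tomato',
    'http://int.google.com',
    'http://google.co.uk',
    'http://subdomain.example.co.uk', // made-up URL: subdomain longer than the domain
);
foreach ($urls as $url) {
    echo guessDomainLabel($url) . "\n";
}
```

The first three print google and the last prints example, where the longest-word guess would have returned subdomain.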