I need to implement functions to check whether paths and URLs are relative, absolute, or invalid (syntactically invalid, not whether the resource exists). What are the range of
I recently started a Composer package that may be useful for checking whether URLs are relative or absolute (and more, of course).
Check out the repository here: https://github.com/Enrise/UriHelper or the Packagist package here: https://packagist.org/packages/enrise/urihelper
Some examples:
$uri = new \Enrise\Uri('http://usr:pss@example.com:81/mypath/myfile.html?a=b&b[]=2&b[]=3#myfragment');
echo $uri->getScheme(); // http
echo $uri->getUser(); // usr
echo $uri->getPass(); // pss
echo $uri->getHost(); // example.com
echo $uri->getPort(); // 81
echo $uri->getPath(); // /mypath/myfile.html
echo $uri->getQuery(); // a=b&b[]=2&b[]=3
echo $uri->getFragment(); // myfragment
var_dump($uri->isSchemeless()); // bool(false); echo would print an empty string for false
var_dump($uri->isRelative()); // bool(false)
$uri->setScheme('scheme:child:scheme.VALIDscheme123:');
$uri->setPort(null);
echo $uri->getUri(); //scheme:child:scheme.VALIDscheme123:usr:pss@example.com/mypath/myfile.html?a=b&b[]=2&b[]=3#myfragment
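I think the best approach is parse_url(), which splits each form of URL into its components:
$test = [
    '/link?param=1' => parse_url('/link?param=1'),
    '//aaa.com/link?param=1' => parse_url('//aaa.com/link?param=1'),
    'http://aaa.com/link?param=1' => parse_url('http://aaa.com/link?param=1'),
];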
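As a rough sketch of how the parse_url() result could be turned into a classification: the function name classifyUrl and the four labels below are hypothetical, not part of PHP. The idea is simply to look at which components are present.

```php
<?php
// Hypothetical helper: classify a URL reference by the components parse_url() finds.
function classifyUrl(string $url): string
{
    $parts = parse_url($url);
    if ($parts === false) {
        return 'invalid';            // parse_url() gives up on seriously malformed URLs
    }
    if (isset($parts['scheme'])) {
        return 'absolute';           // e.g. http://aaa.com/link?param=1
    }
    if (isset($parts['host'])) {
        return 'protocol-relative';  // e.g. //aaa.com/link?param=1
    }
    return 'relative';               // e.g. /link?param=1 or link?param=1
}
```

Note that parse_url() is lenient and does not validate the URL, so 'invalid' here only catches input it cannot split at all.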
Absolute Paths and URLs
You are correct: absolute paths in Linux must start with /, so checking for a leading slash is enough.
For URLs you need to check for http:// and https://, as you wrote. However, other URLs start with ftp://, sftp://, or smb://, so it depends on which schemes you want to cover.
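A minimal sketch of both checks, assuming you want a fixed whitelist of schemes. The function names and the default scheme list are illustrative assumptions, so adjust them to the range of uses you need.

```php
<?php
// Hypothetical helper: a Linux path is absolute when it starts with "/".
function isAbsoluteUnixPath(string $path): bool
{
    return $path !== '' && $path[0] === '/';
}

// Hypothetical helper: a URL is absolute when it starts with "<scheme>://"
// and the scheme is in the (assumed) whitelist.
function isAbsoluteUrl(string $url, array $schemes = ['http', 'https', 'ftp', 'sftp', 'smb']): bool
{
    // Capture the scheme name, case-insensitively.
    if (!preg_match('~^([a-z][a-z0-9+.-]*)://~i', $url, $m)) {
        return false;
    }
    return in_array(strtolower($m[1]), $schemes, true);
}
```

Passing a different array as the second argument narrows or widens the set of accepted schemes.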
Invalid Paths and URLs
Assuming you are referring to Linux, the only characters forbidden in a file name are / and \0 (within a full path, / acts only as the separator, so \0 is the one character a path cannot contain). This is actually filesystem dependent, but you can assume the above to be correct for most uses.
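For the Linux case, the rule could be sketched roughly as follows. Both function names are hypothetical, and the check is purely syntactic: it ignores length limits and filesystem-specific restrictions.

```php
<?php
// Hypothetical helper: a whole path is rejected only if it is empty or contains a NUL byte.
function isValidLinuxPath(string $path): bool
{
    return $path !== '' && strpos($path, "\0") === false;
}

// Hypothetical helper: a single file name (one path component) additionally may not contain "/".
function isValidLinuxFileName(string $name): bool
{
    return $name !== ''
        && strpos($name, '/') === false
        && strpos($name, "\0") === false;
}
```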
In Windows it is more complicated. You can read about it in the Path.GetInvalidPathChars Method documentation under Remarks.
URLs are more complicated than Linux paths, as the only allowed characters are A-Z, a-z, 0-9, -, ., _, ~, :, /, ?, #, [, ], @, !, $, &, ', (, ), *, +, ,, ;, and = (as described in another answer here). Any other character has to be percent-encoded as %XX.
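A character-level check along these lines might look like the sketch below. The function name is hypothetical, and accepting %XX escapes is an addition to the character list above, based on how percent-encoding works. It only tests which characters occur, not whether they appear in a sensible position.

```php
<?php
// Hypothetical helper: true when $url consists solely of the allowed characters
// listed above or of percent-encoded bytes (%XX).
function hasOnlyUrlChars(string $url): bool
{
    // \z anchors at the very end, so a trailing newline is rejected too.
    $pattern = '~^(?:[A-Za-z0-9\-._\~:/?#\[\]@!$&\'()*+,;=]|%[0-9A-Fa-f]{2})*\z~';
    return preg_match($pattern, $url) === 1;
}
```

An empty string passes this check, so combine it with a non-empty test if that matters for your use.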
Relative Paths and URLs
In general, paths and URLs which are neither absolute nor invalid are relative.
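Put together for Linux paths, that rule of elimination could be sketched like this. The function name and the three labels are hypothetical, and the validity test is the simple NUL-byte rule from the previous section.

```php
<?php
// Hypothetical helper: invalid first, then absolute, and whatever remains is relative.
function classifyLinuxPath(string $path): string
{
    if ($path === '' || strpos($path, "\0") !== false) {
        return 'invalid';
    }
    return $path[0] === '/' ? 'absolute' : 'relative';
}
```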
From the Symfony Filesystem component, to check whether a path is absolute:
public function isAbsolutePath($file)
{
    return strspn($file, '/\\', 0, 1)                   // starts with / or \
        || (strlen($file) > 3 && ctype_alpha($file[0])  // Windows drive letter,
            && substr($file, 1, 1) === ':'              // followed by a colon
            && strspn($file, '/\\', 2, 1)               // and then / or \
        )
        || null !== parse_url($file, PHP_URL_SCHEME)    // has a URL scheme
    ;
}
This function is taken from Drupal:
public function is_absolute($url)
{
$pattern = "/^(?:ftp|https?|feed):\/\/(?:(?:(?:[\w\.\-\+!$&'\(\)*\+,;=]|%[0-9a-f]{2})+:)*
(?:[\w\.\-\+%!$&'\(\)*\+,;=]|%[0-9a-f]{2})+@)?(?:
(?:[a-z0-9\-\.]|%[0-9a-f]{2})+|(?:\[(?:[0-9a-f]{0,4}:)*(?:[0-9a-f]{0,4})\]))(?::[0-9]+)?(?:[\/|\?]
(?:[\w#!:\.\?\+=&@$'~*,;\/\(\)\[\]\-]|%[0-9a-f]{2})*)?$/xi";
return (bool) preg_match($pattern, $url);
}
If you already know that the URL is well formed:
if (strpos($uri, '://') !== false) {
    // protocol: absolute URL
} elseif (substr($uri, 0, 1) === '/') {
    // leading '/': absolute to domain name (half relative)
} else {
    // no protocol and no leading slash: relative to this page
}