Consider the following array:
/www/htdocs/1/sites/lib/abcdedd
/www/htdocs/1/sites/conf/xyz
/www/htdocs/1/sites/conf/abc/
You can use XOR
in this situation to find the common part of the strings. Whenever you XOR two bytes that are the same, you get a null byte as the output. So we can use that to our advantage:
$first = $array[0];
$length = strlen($first);
$count = count($array);
for ($i = 1; $i < $count; $i++) {
$length = min($length, strspn($array[$i] ^ $first, chr(0)));
}
After that single loop, the $length
variable holds the length of the longest common prefix shared by all strings in the array. Then we can extract the common part from the first element:
$common = substr($array[0], 0, $length);
And there you have it. As a function:
function commonPrefix(array $strings) {
$first = $strings[0];
$length = strlen($first);
$count = count($strings);
for ($i = 1; $i < $count; $i++) {
$length = min($length, strspn($strings[$i] ^ $first, chr(0)));
}
return substr($first, 0, $length);
}
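Note that it does use more than one iteration, but those iterations happen inside built-in functions (the string XOR and strspn) rather than in PHP code, so in an interpreted language this should be considerably faster than comparing characters one by one in a userland loop.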
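To make the XOR step concrete, here is a minimal sketch using two of the sample paths. The variable names are only illustrative. It relies on PHP's string XOR producing a result as long as the shorter operand, and on strspn counting the leading run of null bytes.

```php
<?php
// XOR of two strings works byte by byte; equal bytes yield "\0".
$a = '/www/htdocs/1/sites/lib/abcdedd';
$b = '/www/htdocs/1/sites/conf/xyz';

$xored  = $a ^ $b;               // as long as the shorter of the two strings
$shared = strspn($xored, "\0");  // number of leading null bytes = matching bytes

echo $shared, "\n";                 // 20
echo substr($a, 0, $shared), "\n";  // /www/htdocs/1/sites/
```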
Now, if you want only full paths, we need to truncate to the last /
character. So:
$prefix = preg_replace('#/[^/]*$#', '', commonPrefix($paths));
Note that this may cut too much: for two strings such as /foo/bar
and /foo/bar/baz
the result will be cut to /foo
. But short of adding another iteration round to determine whether the next character is either /
or end-of-string, I can't see a way around that...
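For what it's worth, that extra round could look roughly like the sketch below. The helper name commonDirPrefix is hypothetical, and the code assumes the input is an array of strings. It keeps the full prefix only when every string either ends there or continues with a slash, and otherwise falls back to the last slash inside the prefix.

```php
<?php
// Hypothetical helper (not part of the answer above): longest common
// prefix that ends on a path boundary. Assumes an array of strings.
function commonDirPrefix(array $paths): string
{
    if ($paths === []) {
        return '';
    }
    $paths  = array_values($paths);
    $first  = $paths[0];
    $length = strlen($first);
    foreach ($paths as $path) {
        // XOR yields "\0" wherever the bytes match; count the leading run.
        $length = min($length, strspn($path ^ $first, "\0"));
    }
    foreach ($paths as $path) {
        // The prefix is a whole path only if each string ends here
        // or continues with a slash.
        if (strlen($path) > $length && $path[$length] !== '/') {
            $slash = strrpos(substr($first, 0, $length), '/');
            return $slash === false ? '' : substr($first, 0, $slash);
        }
    }
    return substr($first, 0, $length);
}

echo commonDirPrefix(['/foo/bar', '/foo/bar/baz']), "\n"; // /foo/bar
echo commonDirPrefix(['/foo/bar', '/foo/baz']), "\n";     // /foo
```

The second pass costs one more loop over the array, which is the trade-off mentioned above.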
Ok, I'm not sure this is bullet-proof, but I think it works:
echo array_reduce($array, function($reducedValue, $arrayValue) {
if($reducedValue === NULL) return $arrayValue;
for($i = 0; $i < strlen($reducedValue); $i++) {
if(!isset($arrayValue[$i]) || $arrayValue[$i] !== $reducedValue[$i]) {
return substr($reducedValue, 0, $i);
}
}
return $reducedValue;
});
This takes the first value in the array as the reference string. It then iterates over the reference string and compares each character with the character of the next string at the same position. If a character doesn't match, the reference string is shortened to that position and the next string is compared. In the end the function returns the longest prefix shared by all strings.
Performance depends on the strings given. The earlier the reference string gets shorter, the quicker the code will finish. I really have no clue how to put that in a formula though.
I found that Artefacto's approach of sorting the strings first increases performance. Adding
asort($array);
$array = array(array_shift($array), array_pop($array));
before the array_reduce
will significantly increase performance.
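The reason this works: after a byte-wise sort, the first and last strings are the two that differ earliest, so their common prefix is the common prefix of the whole array. A small sketch of that idea follows. Note the assumptions: it uses sort with SORT_STRING instead of asort to force plain string comparison, and it uses the XOR comparison for brevity instead of array_reduce.

```php
<?php
$paths = [
    '/www/htdocs/1/sites/lib/abcdedd',
    '/www/htdocs/1/sites/conf/xyz',
    '/www/htdocs/1/sites/conf/abc/def',
    '/www/htdocs/1/sites/htdocs/xyz',
    '/www/htdocs/1/sites/lib2/abcdedd',
];

// Sort byte-wise so that the extremes end up at both ends of the array.
sort($paths, SORT_STRING);
$first = $paths[0];
$last  = $paths[count($paths) - 1];

// Only these two strings need to be compared.
$length = strspn($first ^ $last, "\0");

echo substr($first, 0, $length), "\n"; // /www/htdocs/1/sites/
```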
Also note that this returns the longest matching initial substring, which is more versatile but won't give you the common path. You have to run
substr($result, 0, strrpos($result, '/'));
on the result. Then, with that value stored in $path, you can strip it from each entry:
print_r(array_map(function($v) use ($path){
return str_replace($path, '', $v);
}, $array));
which should give:
[0] => /lib/abcdedd
[1] => /conf/xyz/
[2] => /conf/abc/def
[3] => /htdocs/xyz
[4] => /lib2/abcdedd
Feedback welcome.
$arrMain = array(
'/www/htdocs/1/sites/lib/abcdedd',
'/www/htdocs/1/sites/conf/xyz',
'/www/htdocs/1/sites/conf/abc/def',
'/www/htdocs/1/sites/htdocs/xyz',
'/www/htdocs/1/sites/lib2/abcdedd'
);
function explodePath( $strPath ){
return explode("/", $strPath);
}
function removePath( $strPath)
{
global $strCommon;
return str_replace( $strCommon, '', $strPath );
}
$arrExplodedPaths = array_map( 'explodePath', $arrMain ) ;
//Check for common and skip first 1
$strCommon = '';
for( $i=1; $i< count( $arrExplodedPaths[0] ); $i++)
{
for( $j = 0; $j < count( $arrExplodedPaths); $j++ )
{
if( !isset( $arrExplodedPaths[ $j ][ $i ] ) || $arrExplodedPaths[0][ $i ] !== $arrExplodedPaths[ $j ][ $i ] )
{
break 2;
}
}
$strCommon .= '/'.$arrExplodedPaths[0][$i];
}
print_r( array_map( 'removePath', $arrMain ) );
This works fine. It is similar to Mark Baker's answer, but uses str_replace.
I would explode
the values based on the / and then use array_intersect_assoc
to detect the common elements and ensure they have the correct corresponding index in the array. The resulting array could be recombined to produce the common path.
function getCommonPath($pathArray)
{
$pathElements = array();
foreach($pathArray as $path)
{
$pathElements[] = explode("/",$path);
}
$commonPath = $pathElements[0];
for($i=1;$i<count($pathElements);$i++)
{
$commonPath = array_intersect_assoc($commonPath,$pathElements[$i]);
}
if(is_array($commonPath)) return implode("/",$commonPath);
else return null;
}
function removeCommonPath($pathArray)
{
$commonPath = getCommonPath($pathArray);
for($i=0;$i<count($pathArray);$i++)
{
$pathArray[$i] = substr($pathArray[$i],strlen($commonPath));
}
return $pathArray;
}
This is untested, but the idea is that the $commonPath
array only ever contains the path elements that appear, at the same position, in every path array that has been compared against it. When the loop is complete, we simply recombine it with / to get the true $commonPath
Update
As pointed out by Felix Kling, array_intersect
won't consider paths that have common elements but in different orders... To solve this, I used array_intersect_assoc
instead of array_intersect
Update Added code to remove the common path (or tetris it!) from the array as well.
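One caveat worth checking: array_intersect_assoc keeps every segment that matches at the same index, including segments after the first mismatch (for /a/x/c and /a/y/c it would keep both a and c). A possible variation that stops at the first differing segment is sketched below. The function name commonPathBySegments is made up for this example, and the arrow function needs PHP 7.4 or later.

```php
<?php
// Hypothetical helper: compare segment by segment and stop at the
// first position where any path differs or runs out of segments.
function commonPathBySegments(array $paths): string
{
    if ($paths === []) {
        return '';
    }
    $segments = array_map(fn($p) => explode('/', $p), array_values($paths));
    $common = [];
    foreach ($segments[0] as $i => $segment) {
        foreach ($segments as $other) {
            if (!isset($other[$i]) || $other[$i] !== $segment) {
                break 2; // first mismatch ends the common path
            }
        }
        $common[] = $segment;
    }
    return implode('/', $common);
}

echo commonPathBySegments(['/a/x/c', '/a/y/c']), "\n"; // /a
```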
$common = PHP_INT_MAX;
foreach ($a as $item) {
$common = min($common, str_common($a[0], $item, $common));
}
$result = array();
foreach ($a as $item) {
$result[] = substr($item, $common);
}
print_r($result);
function str_common($a, $b, $max)
{
$pos = 0;
$last_slash = 0;
$len = min(strlen($a), strlen($b), $max + 1);
while ($pos < $len) {
if ($a[$pos] != $b[$pos]) return $last_slash;
if ($a[$pos] == '/') $last_slash = $pos;
$pos++;
}
return $last_slash;
}
This has the disadvantage of not having linear time complexity because of the sort; however, in most cases the sort will definitely not be the operation that takes the most time.
Basically, the clever part (at least I couldn't find a fault with it) here is that after sorting you will only have to compare the first path with the last.
sort($a);
$a = array_map(function ($el) { return explode("/", $el); }, $a);
$first = reset($a);
$last = end($a);
for ($eqdepth = 0; isset($first[$eqdepth], $last[$eqdepth]) && $first[$eqdepth] === $last[$eqdepth]; $eqdepth++) {}
array_walk($a,
function (&$el) use ($eqdepth) {
for ($i = 0; $i < $eqdepth; $i++) {
array_shift($el);
}
});
$res = array_map(function ($el) { return implode("/", $el); }, $a);