Tetris-ing an array

时光取名叫无心 2021-01-30 15:38

Consider the following array:

/www/htdocs/1/sites/lib/abcdedd
/www/htdocs/1/sites/conf/xyz
/www/htdocs/1/sites/conf/abc/         


        
16 Answers
  • 2021-01-30 15:56

    Well, you can use XOR in this situation to find the common parts of the strings. Any time you XOR two bytes that are the same, you get a null byte as the output. So we can use that to our advantage:

    $first = $array[0];
    $length = strlen($first);
    $count = count($array);
    for ($i = 1; $i < $count; $i++) {
        $length = min($length, strspn($array[$i] ^ $first, chr(0)));
    }
    

    After that single loop, the $length variable will be equal to the length of the longest common prefix of the strings in the array. Then, we can extract the common part from the first element:

    $common = substr($array[0], 0, $length);
    

    And there you have it. As a function:

    function commonPrefix(array $strings) {
        $first = $strings[0];
        $length = strlen($first);
        $count = count($strings);
        for ($i = 1; $i < $count; $i++) {
            $length = min($length, strspn($strings[$i] ^ $first, chr(0)));
        }
        return substr($first, 0, $length);
    }
    

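    Note that this does make more than one pass over the data, but those passes run inside built-in functions (the string XOR and strspn) rather than in an interpreted PHP loop, so it should be considerably faster than comparing character by character in PHP code.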

    Now, if you want only full paths, we need to truncate to the last / character. So:

    $prefix = preg_replace('#/[^/]*$#', '', commonPrefix($paths));
    

    Now, this may cut too much: given /foo/bar and /foo/bar/baz, the result is /foo rather than /foo/bar. But short of adding another iteration round to determine whether the next character is either / or end-of-string, I can't see a way around that...
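
    The extra round mentioned above could look something like this. It is only a sketch: commonDirectory() is a hypothetical name that is not part of the answer, and it assumes the paths use / as separator and contain no doubled slashes.

```php
<?php
// Hypothetical helper (not from the answer): common directory of a list of paths.
function commonDirectory(array $paths): string
{
    if ($paths === []) {
        return '';
    }

    // Byte-wise common prefix, using the XOR/strspn trick shown earlier.
    $first = reset($paths);
    $length = strlen($first);
    foreach ($paths as $path) {
        $length = min($length, strspn($path ^ $first, "\0"));
    }
    $prefix = substr($first, 0, $length);

    // Extra round: the prefix is a whole directory only if every path
    // either ends exactly there or continues with a slash.
    foreach ($paths as $path) {
        if (strlen($path) > $length && $path[$length] !== '/') {
            // The prefix stops mid-segment, so cut back to the last slash.
            $pos = strrpos($prefix, '/');
            return $pos === false ? '' : substr($prefix, 0, $pos);
        }
    }

    // Every path ends here or continues with '/': keep the whole prefix.
    return rtrim($prefix, '/');
}
```

    With /foo/bar and /foo/bar/baz this keeps /foo/bar, while /foo/bar and /foo/baz still fall back to /foo.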

  • 2021-01-30 15:57

    Ok, I'm not sure this is bullet-proof, but I think it works:

    echo array_reduce($array, function($reducedValue, $arrayValue) {
        if($reducedValue === NULL) return $arrayValue;
        for($i = 0; $i < strlen($reducedValue); $i++) {
            if(!isset($arrayValue[$i]) || $arrayValue[$i] !== $reducedValue[$i]) {
                return substr($reducedValue, 0, $i);
            }
        }
        return $reducedValue;
    });
    

    This takes the first value in the array as the reference string. It then iterates over the reference string and compares each character with the character at the same position in the next string. If a character doesn't match, the reference string is shortened to that position and the next string is compared. In the end, the function returns the shortest matching string.

    Performance depends on the strings given. The earlier the reference string gets shorter, the quicker the code will finish. I really have no clue how to put that in a formula though.

    I found that Artefacto's approach to sort the strings increases performance. Adding

    asort($array);
    $array = array(array_shift($array), array_pop($array));
    

    before the array_reduce will significantly increase performance.

    Also note that this will return the longest matching initial substring, which is more versatile but won't give you the common path. You have to run

    substr($result, 0, strrpos($result, '/'));
    

    on the result. You can then use that path (assumed to be stored in $path) to remove it from the values:

    print_r(array_map(function($v) use ($path){
        return str_replace($path, '', $v);
    }, $array));
    

    which should give:

    [0] => /lib/abcdedd
    [1] => /conf/xyz/
    [2] => /conf/abc/def
    [3] => /htdocs/xyz
    [4] => /lib2/abcdedd
    

    Feedback welcome.

  • 2021-01-30 15:57
    $arrMain = array(
                '/www/htdocs/1/sites/lib/abcdedd',
                '/www/htdocs/1/sites/conf/xyz',
                '/www/htdocs/1/sites/conf/abc/def',
                '/www/htdocs/1/sites/htdocs/xyz',
                '/www/htdocs/1/sites/lib2/abcdedd'
    );
    function explodePath( $strPath ){ 
        return explode("/", $strPath);
    }
    
    function removePath( $strPath)
    {
        global $strCommon;
        return str_replace( $strCommon, '', $strPath );
    }
    $arrExplodedPaths = array_map( 'explodePath', $arrMain ) ;
    
    //Check for common and skip first 1
    $strCommon = '';
    for( $i=1; $i< count( $arrExplodedPaths[0] ); $i++)
    {
        for( $j = 0; $j < count( $arrExplodedPaths); $j++ )
        {
            if( !isset( $arrExplodedPaths[ $j ][ $i ] ) || $arrExplodedPaths[0][ $i ] !== $arrExplodedPaths[ $j ][ $i ] )
            {
                break 2;
            } 
        }
        $strCommon .= '/'.$arrExplodedPaths[0][$i];
    }
    print_r( array_map( 'removePath', $arrMain ) );
    

    This works fine. It is similar to Mark Baker's answer but uses str_replace.
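
    One thing to keep in mind: str_replace removes every occurrence of the search string, not only the one at the start, so a path that repeats the common part further down gets mangled. A minimal sketch of a prefix-only alternative follows; stripLeadingPrefix() is a made-up helper name and the sample path is invented purely for illustration.

```php
<?php
// Hypothetical helper: remove $prefix only where it starts the path.
function stripLeadingPrefix(string $path, string $prefix): string
{
    if ($prefix !== '' && strncmp($path, $prefix, strlen($prefix)) === 0) {
        return substr($path, strlen($prefix));
    }
    return $path; // the path does not start with the prefix: leave it alone
}

// Invented sample data in which the common part appears twice.
$common = '/www/htdocs';
$path   = '/www/htdocs/1/mirror/www/htdocs/index.php';

echo str_replace($common, '', $path), "\n";    // /1/mirror/index.php
echo stripLeadingPrefix($path, $common), "\n"; // /1/mirror/www/htdocs/index.php
```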

  • 2021-01-30 15:58

    I would explode the values based on the / and then use array_intersect_assoc to detect the common elements and ensure they have the correct corresponding index in the array. The resulting array could be recombined to produce the common path.

    function getCommonPath($pathArray)
    {
        $pathElements = array();
    
        foreach($pathArray as $path)
        {
            $pathElements[] = explode("/",$path);
        }
    
        $commonPath = $pathElements[0];
    
        for($i=1;$i<count($pathElements);$i++)
        {
            $commonPath = array_intersect_assoc($commonPath,$pathElements[$i]);
        }
    
        if(is_array($commonPath)) return implode("/",$commonPath);
        else return null;
    }
    
    function removeCommonPath($pathArray)
    {
        $commonPath = getCommonPath($pathArray);
    
        for($i=0;$i<count($pathArray);$i++)
        {
            $pathArray[$i] = substr($pathArray[$i],strlen($commonPath));
        }
    
        return $pathArray;
    }
    

    This is untested, but the idea is that the $commonPath array only ever contains the elements of the path that have been present in all path arrays compared against it so far. When the loop is complete, we simply recombine it with / to get the true $commonPath.

    Update: As pointed out by Felix Kling, array_intersect would also match elements that are common to the paths but sit at different positions. To solve this, I used array_intersect_assoc instead of array_intersect.

    Update: Added code to remove the common path (or tetris it!) from the array as well.
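
    Note that array_intersect_assoc keeps any segment that matches at the same index, even when an earlier segment differed, so /a/b/c and /a/x/c would come out of getCommonPath as /a/c. A possible variant that stops at the first gap is sketched below; getContiguousCommonPath() is a hypothetical name, not something from the code above.

```php
<?php
// Hypothetical variant of getCommonPath() that only keeps leading segments.
function getContiguousCommonPath(array $paths): string
{
    if ($paths === []) {
        return '';
    }

    // Split every path into its segments.
    $segments = array_map(function ($path) {
        return explode('/', $path);
    }, array_values($paths));

    // Keep only the segments that match at the same index in all paths.
    $common = $segments[0];
    foreach ($segments as $current) {
        $common = array_intersect_assoc($common, $current);
    }

    // Walk indexes 0, 1, 2, ... and stop at the first missing one, so a
    // match that comes after a mismatch is discarded.
    $result = [];
    for ($i = 0; array_key_exists($i, $common); $i++) {
        $result[] = $common[$i];
    }

    return implode('/', $result);
}
```

    For /a/b/c and /a/x/c this returns /a, and for the five sample paths it returns /www/htdocs/1/sites.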

  • 2021-01-30 16:02
    $common = PHP_INT_MAX;
    foreach ($a as $item) {
            $common = min($common, str_common($a[0], $item, $common));
    }
    
    $result = array();
    foreach ($a as $item) {
            $result[] = substr($item, $common);
    }
    print_r($result);
    
    function str_common($a, $b, $max)
    {
            $pos = 0;
            $last_slash = 0;
            $len = min(strlen($a), strlen($b), $max + 1);
            while ($pos < $len) {
                    // Square brackets: the {} string offset syntax was removed in PHP 8.0.
                    if ($a[$pos] != $b[$pos]) return $last_slash;
                    if ($a[$pos] == '/') $last_slash = $pos;
                    $pos++;
            }
            return $last_slash;
    }
    
  • 2021-01-30 16:02

    This has the disadvantage of not having linear time complexity; however, in most cases the sort will definitely not be the operation that takes the most time.

    Basically, the clever part (at least I couldn't find a fault with it) here is that after sorting you will only have to compare the first path with the last.

    sort($a);
    $a = array_map(function ($el) { return explode("/", $el); }, $a);
    $first = reset($a);
    $last = end($a);
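    // The isset() guard stops the loop when the shorter path runs out of segments.
    for ($eqdepth = 0; isset($first[$eqdepth], $last[$eqdepth]) && $first[$eqdepth] === $last[$eqdepth]; $eqdepth++) {}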
    array_walk($a,
        function (&$el) use ($eqdepth) {
            for ($i = 0; $i < $eqdepth; $i++) {
                array_shift($el);
            }
         });
    $res = array_map(function ($el) { return implode("/", $el); }, $a);
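
    The reason comparing only the first and last element is enough: in byte-wise sorted order, every string in between has to share whatever prefix the two extremes share. A stripped-down sketch of just that step, working on characters rather than path segments, might look like this (commonPrefixBySorting() is a made-up name, and SORT_STRING is used so the strings are compared as plain strings):

```php
<?php
// Hypothetical helper: longest common string prefix via sort + first/last.
function commonPrefixBySorting(array $strings): string
{
    if ($strings === []) {
        return '';
    }

    // Plain string ordering; sort() also reindexes the array from 0.
    sort($strings, SORT_STRING);
    $first = $strings[0];
    $last  = $strings[count($strings) - 1];

    // Only the two extremes need to be compared.
    $max = min(strlen($first), strlen($last));
    $i = 0;
    while ($i < $max && $first[$i] === $last[$i]) {
        $i++;
    }

    return substr($first, 0, $i);
}
```

    For the five sample paths this gives /www/htdocs/1/sites/ (including the trailing slash), which still has to be cut back to a directory boundary if only whole path segments are wanted.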
    