Finding the common prefix of an array of strings

一向 2020-11-29 06:15

I have an array like this:

$sports = array(
'Softball - Counties',
'Softball - Eastern',
'Softball - North Harbour',
'Softball - South',
'Softball -


        
17 answers
  • 2020-11-29 06:41

    For what it's worth, here's another alternative I came up with.

    I used this to find the common prefix of a list of product codes (i.e., where multiple product SKUs share a series of characters at the start):

    /**
     * Try to find a common prefix for a list of strings
     * 
     * @param array $strings
     * @return string
     */
    function findCommonPrefix(array $strings)
    {
        $prefix = '';
        $chars = array_map("str_split", $strings);
        $matches = call_user_func_array("array_intersect_assoc", $chars);
        if ($matches) {
            $i = 0;
            foreach ($matches as $key => $value) {
                if ($key != $i) {
                    unset($matches[$key]);
                }
                $i++;
            }
            $prefix = join('', $matches);
        }
    
        return $prefix;
    }
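
    To see why the loop has to drop keys, it can help to print what array_intersect_assoc returns for the split strings. A minimal sketch, assuming two made-up product codes (not from the original data) that agree again after their first mismatch:

    ```php
    <?php
    // Hypothetical sample codes: identical at indexes 0-3, different at index 4,
    // identical again at index 5.
    $strings = ['AB-10X', 'AB-12X'];
    $chars = array_map('str_split', $strings);

    // Keeps every position where all strings hold the same character,
    // including positions that come after the first mismatch.
    $matches = call_user_func_array('array_intersect_assoc', $chars);

    print_r($matches); // keys 0, 1, 2, 3 and 5 survive; key 4 is missing
    ```

    Position 5 matches in both strings but is not part of the prefix, which is why findCommonPrefix discards every entry whose key no longer lines up with the running counter.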
    
  • 2020-11-29 06:43

    If you can sort your array, then there is a simple and very fast solution.

    Simply compare the first item to the last one.

    If the strings are sorted, any prefix common to all strings will be common to the sorted first and last strings.

    sort($sports, SORT_STRING); // compare as strings, so numeric-looking values are not ordered as numbers
    
    $s1 = $sports[0];                // First string
    $s2 = $sports[count($sports)-1]; // Last string
    $len = min(strlen($s1), strlen($s2));
    
    // While we still have string to compare,
    // if the indexed character is the same in both strings,
    // increment the index. 
    for ($i=0; $i<$len && $s1[$i]==$s2[$i]; $i++); 
    
    $prefix = substr($s1, 0, $i);
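
    The same idea can be wrapped in a function. This is only a sketch: the name commonPrefixSorted and the choice to return an empty string for an empty list are assumptions, and SORT_STRING is passed because PHP's default sort flags compare numeric-looking strings as numbers, which would break the "first and last" argument.

    ```php
    <?php
    /**
     * Hypothetical helper: common prefix via sort-and-compare-extremes.
     */
    function commonPrefixSorted(array $strings): string
    {
        if ($strings === []) {
            return ''; // assumption: an empty list has an empty prefix
        }

        // Arrays are passed by value, so the caller's array is not reordered.
        sort($strings, SORT_STRING);

        $first = $strings[0];
        $last  = $strings[count($strings) - 1];
        $len   = min(strlen($first), strlen($last));

        // Walk forward while both extremes agree.
        $i = 0;
        while ($i < $len && $first[$i] === $last[$i]) {
            $i++;
        }

        return substr($first, 0, $i);
    }

    $sports = [
        'Softball - Counties',
        'Softball - Eastern',
        'Softball - North Harbour',
        'Softball - South',
    ];

    var_dump(commonPrefixSorted($sports)); // string(11) "Softball - "
    ```

    Sorting costs more than a single pass over the strings, so this mainly pays off when the array is already sorted or needs sorting anyway.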
    
  • 2020-11-29 06:46

    I would use this:

    $prefix = array_shift($array);  // take the first item as initial prefix
    $length = strlen($prefix);
    // compare the current prefix with the prefix of the same length of the other items
    foreach ($array as $item) {
        // check if there is a match; if not, decrease the prefix by one character at a time
        while ($length && substr($item, 0, $length) !== $prefix) {
            $length--;
            $prefix = substr($prefix, 0, -1);
        }
        if (!$length) {
            break;
        }
    }
    

    Update: Here's another solution, which compares the n-th character of every string in turn until a mismatch is found:

    $pl = 0; // common prefix length
    $n = count($array);
    $l = strlen($array[0]);
    while ($pl < $l) {
        $c = $array[0][$pl];
        for ($i=1; $i<$n; $i++) {
            // isset() guards against strings shorter than the first one
            if (!isset($array[$i][$pl]) || $array[$i][$pl] !== $c) break 2;
        }
        $pl++;
    }
    $prefix = substr($array[0], 0, $pl);
    

    This is even more efficient, as it needs only on the order of numberOfStrings · commonPrefixLength character comparisons.
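
    The comparison count can be checked on a small input. The sketch below is an instrumented variant of the loop above; the function name, the by-reference counter and the sample strings are assumptions for illustration, and it expects a list indexed from 0:

    ```php
    <?php
    // Hypothetical instrumented version of the column-wise scan.
    // $comparisons exists only to count character comparisons.
    function commonPrefixLengthCounted(array $strings, int &$comparisons): int
    {
        $comparisons = 0;
        $n = count($strings);
        if ($n === 0) {
            return 0;
        }

        $first = $strings[0];
        $limit = strlen($first);

        for ($pl = 0; $pl < $limit; $pl++) {
            for ($i = 1; $i < $n; $i++) {
                $comparisons++;
                // isset() covers strings that are shorter than the first one.
                if (!isset($strings[$i][$pl]) || $strings[$i][$pl] !== $first[$pl]) {
                    return $pl;
                }
            }
        }

        return $limit;
    }

    $codes = ['abcd', 'abce', 'abXf'];
    $count = 0;
    $pl = commonPrefixLengthCounted($codes, $count);

    echo "prefix length: $pl, comparisons: $count\n"; // prints "prefix length: 2, comparisons: 6"
    ```

    Here 3 strings with a common prefix of length 2 cost 6 comparisons. The worst case is (numberOfStrings - 1) · (commonPrefixLength + 1), since the mismatching column is inspected too.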

  • 2020-11-29 06:46
    
        // Common prefix
        $common = '';
    
        $sports = array(
        'Softball T - Counties',
        'Softball T - Eastern',
        'Softball T - North Harbour',
        'Softball T - South',
        'Softball T - Western'
        );
    
        // find the length of the shortest string
        $minLen = strlen($sports[0]);
        foreach ($sports as $s){
            if($minLen > strlen($s))
                $minLen = strlen($s);
        }
    
    
        // flag to break out of inner loop
        $flag = false;
    
        // The common prefix cannot be longer than the shortest string.
        // The nested loop below is O(n * m) for n strings of minimum length m; this can be improved.
        for ($i = 0 ; $i < $minLen; $i++){
            $tmp = $sports[0][$i];
    
            foreach ($sports as $s){
                if($s[$i] != $tmp) {
                    $flag = true;
                    break; // no need to check the remaining strings
                }
            }
            if($flag)
                break;
            else
                $common .= $sports[0][$i];
        }
    
        print $common;
    
  • 2020-11-29 06:47

    Short and sweet version, perhaps not the most efficient:

    /// Return length of longest common prefix in an array of strings.
    function _commonPrefix($array) {
        if(count($array) < 2) {
            if(count($array) == 0)
                return false; // empty array: undefined prefix
            else
                return strlen($array[0]); // 1 element: trivial case
        }
        $len = min(array_map('strlen',$array)); // initial upper limit: length of the shortest string, so no index runs past the end
        $prevval = reset($array);
        while(($newval = next($array)) !== FALSE) {
            for($j = 0 ; $j < $len ; $j += 1)
                if($newval[$j] != $prevval[$j])
                    $len = $j;
            $prevval = $newval;
        }
        return $len;
    }
    
    // TEST CASE:
    $arr = array('/var/yam/yamyam/','/var/yam/bloorg','/var/yar/sdoo');
    print_r($arr);
    $plen = _commonPrefix($arr);
    $pstr = substr($arr[0],0,$plen);
    echo "Res: $plen\n";
    echo "==> ".$pstr."\n";
    echo "dir: ".dirname($pstr.'aaaa')."\n";
    

    Output of the test case:

    Array
    (
        [0] => /var/yam/yamyam/
        [1] => /var/yam/bloorg
        [2] => /var/yar/sdoo
    )
    Res: 7
    ==> /var/ya
    dir: /var
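
    The last line of the test case hints at a use for path lists: snapping the raw character prefix back to a whole directory. A small sketch of that step in isolation; commonDirectory is a hypothetical name, and the single-character filler plays the role of 'aaaa' above:

    ```php
    <?php
    // Hypothetical helper: turn a raw character prefix into a directory path.
    function commonDirectory(string $rawPrefix): string
    {
        // The filler stands in for "the rest of the path", so a prefix that
        // ends exactly on a separator keeps its last directory.
        return dirname($rawPrefix . 'x');
    }

    echo commonDirectory('/var/ya'), "\n";   // prefix cut mid-name: prints /var
    echo commonDirectory('/var/yam/'), "\n"; // prefix ends on a separator: prints /var/yam
    ```

    Without the filler, dirname('/var/yam/') would return /var, because dirname drops the last path component even when it is followed by a trailing slash.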
    
  • 2020-11-29 06:48

    This is an addition to @Gumbo's answer. If you want to make sure the common prefix does not cut a word in half, use this. It checks whether the prefix ends with a space. If it does not, and the first string continues past the prefix, the prefix stopped in the middle of a phrase, so the trailing partial word is stripped. Finally, leading and trailing non-word characters are trimmed.

    function product_name_intersection($array){
    
        $pl = 0; // common prefix length
        $n = count($array);
        $l = strlen($array[0]);
        $first = current($array);
    
        while ($pl < $l) {
            $c = $array[0][$pl];
            for ($i=1; $i<$n; $i++) {
                if (!isset($array[$i][$pl]) || $array[$i][$pl] !== $c) break 2;
            }
            $pl++;
        }
        $prefix = substr($array[0], 0, $pl);
    
        if ($pl < strlen($first) && substr($prefix, -1, 1) != ' ') {
    
            $prefix = preg_replace('/\W\w+\s*(\W*)$/', '$1', $prefix);
        }
    
        $prefix =  preg_replace('/^\W*(.+?)\W*$/', '$1', $prefix);
    
        return $prefix;
    }
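
    The word-boundary step can also be tried on its own. The sketch below is a simpler variant of the idea, not the function above: trimToWordBoundary is a hypothetical name, it takes an already computed prefix plus the original strings, it assumes single-byte text, and unlike the code above it only strips trailing whitespace rather than all trailing non-word characters.

    ```php
    <?php
    // Hypothetical helper: drop a trailing partial word from a common prefix.
    function trimToWordBoundary(string $prefix, array $strings): string
    {
        $len = strlen($prefix);

        foreach ($strings as $s) {
            // A word character right after the prefix means the prefix
            // stops in the middle of a word in $s.
            if (isset($s[$len]) && preg_match('/\w/', $s[$len]) === 1) {
                // Remove the trailing run of word characters, if there is one.
                $prefix = preg_replace('/\w+$/', '', $prefix);
                break;
            }
        }

        return rtrim($prefix);
    }

    $names = ['Softball - North Harbour', 'Softball - Northland'];

    // 'Softball - North' is the raw common prefix of the two names.
    var_dump(trimToWordBoundary('Softball - North', $names)); // string(10) "Softball -"
    ```

    Checking every string matters here: the first name continues with a space after "North", and only the second one shows that the prefix ends mid-word.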
    