Sorting an array of strings by arbitrary numbers of substrings

对着背影说爱祢 提交于 2020-12-06 08:27:51

问题


I'm attempting to modify the OrderedImportsFixer class in php-cs-fixer so I can clean up my files the way I want. What I want is to order my imports in a fashion similar to what you'd see in a filesystem listing, with "directories" listed before "files".

So, given this array:

$indexes = [
    26 => ["namespace" => "X\\Y\\Zed"],
    9 =>  ["namespace" => "A\\B\\See"],
    3 =>  ["namespace" => "A\\B\\Bee"],
    38 => ["namespace" => "A\\B\\C\\Dee"],
    51 => ["namespace" => "X\\Wye"],
    16 => ["namespace" => "A\\Sea"],
    12 => ["namespace" => "A\\Bees"],
    31 => ["namespace" => "M"],
];

I'd like this output:

$sorted = [
    38 => ["namespace" => "A\\B\\C\\Dee"],
    3 =>  ["namespace" => "A\\B\\Bee"],
    9 =>  ["namespace" => "A\\B\\See"],
    12 => ["namespace" => "A\\Bees"],
    16 => ["namespace" => "A\\Sea"],
    26 => ["namespace" => "X\\Y\\Zed"],
    51 => ["namespace" => "X\\Wye"],
    31 => ["namespace" => "M"],
];

As in a typical filesystem listing:

I've been going at uasort for a while (key association must be maintained) and have come close. Admittedly, this is due more to desperate flailing than any sort of rigorous methodology. Not really having a sense of how uasort works is kind of limiting me here.

// get the maximum number of namespace components in the list
$ns_counts = array_map(function($val){
    return count(explode("\\", $val["namespace"]));
}, $indexes);
$limit = max($ns_counts);

for ($depth = 0; $depth <= $limit; $depth++) {
    uasort($indexes, function($first, $second) use ($depth, $limit) {
        $fexp = explode("\\", $first["namespace"]);
        $sexp = explode("\\", $second["namespace"]);
        if ($depth === $limit) {
            // why does this help?
            array_pop($fexp);
            array_pop($sexp);
        }
        $fexp = array_slice($fexp, 0, $depth + 1, true);
        $sexp = array_slice($sexp, 0, $depth + 1, true);
        $fimp = implode(" ", $fexp);
        $simp = implode(" ", $sexp);
        //echo "$depth: $fimp <-> $simp\n";
        return strnatcmp($fimp, $simp);
    });
}
echo json_encode($indexes, JSON_PRETTY_PRINT);

This gives me properly sorted output, but with deeper namespaces on the bottom instead of the top:

{
    "31": {
        "namespace": "M"
    },
    "12": {
        "namespace": "A\\Bees"
    },
    "16": {
        "namespace": "A\\Sea"
    },
    "3": {
        "namespace": "A\\B\\Bee"
    },
    "9": {
        "namespace": "A\\B\\See"
    },
    "38": {
        "namespace": "A\\B\\C\\Dee"
    },
    "51": {
        "namespace": "X\\Wye"
    },
    "26": {
        "namespace": "X\\Y\\Zed"
    }
}

I'm thinking I may have to build a separate array for each level of namespace and sort it separately, but have drawn a blank on how I might do that. Any suggestions for getting the last step of this working, or something completely different that doesn't involve so many loops?


回答1:


I believe the following should work:

uasort($indexes, static function (array $entry1, array $entry2): int {  
    $ns1Parts = explode('\\', $entry1['namespace']);
    $ns2Parts = explode('\\', $entry2['namespace']);

    $ns1Length = count($ns1Parts);
    $ns2Length = count($ns2Parts);

    for ($i = 0; $i < $ns1Length && isset($ns2Parts[$i]); $i++) {
        $isLastPartForNs1 = $i === $ns1Length - 1;
        $isLastPartForNs2 = $i === $ns2Length - 1;

        if ($isLastPartForNs1 !== $isLastPartForNs2) {
            return $isLastPartForNs1 <=> $isLastPartForNs2;
        }

        $nsComparison = $ns1Parts[$i] <=> $ns2Parts[$i];

        if ($nsComparison !== 0) {
            return $nsComparison;
        }
    }

    return 0;
});

What it does is:

  • split namespaces into parts,
  • compare each part starting from the first one, and:
    • if we're at the last part for one and not the other, prioritize the one with the most parts,
    • otherwise, if the respective parts are different, prioritize the one that is before the other one alphabetically.

Demo




回答2:


Here's another version that breaks the steps down further that, although it might not be the most optimal, definitely helps my brain think about it. See the comments for more details on what is going on:

uasort(
    $indexes,
    static function (array $a, array $b) {

        $aPath = $a['namespace'];
        $bPath = $b['namespace'];

        // Just in case there are duplicates
        if ($aPath === $bPath) {
            return 0;
        }

        // Break into parts
        $aParts = explode('\\', $aPath);
        $bParts = explode('\\', $bPath);

        // If we only have a single thing then it is a root-level, just compare the item
        if (1 === count($aParts) && 1 === count($bParts)) {
            return $aPath <=> $bPath;
        }

        // Get the class and namespace (file and folder) parts
        $aClass = array_pop($aParts);
        $bClass = array_pop($bParts);

        $aNamespace = implode('\\', $aParts);
        $bNamespace = implode('\\', $bParts);

        // If the namespaces are the same, sort by class name
        if ($aNamespace === $bNamespace) {
            return $aClass <=> $bClass;
        }

        // If the first namespace _starts_ with the second namespace, sort it first
        if (0 === mb_strpos($aNamespace, $bNamespace)) {
            return -1;
        }

        // Same as above but the other way
        if (0 === mb_strpos($bNamespace, $aNamespace)) {
            return 1;
        }

        // Just only by namespace
        return $aNamespace <=> $bNamespace;
    }
);

Online demo




回答3:


We divide this into 4 steps.

Step 1: Create hierarchical structure from the dataset.

function createHierarchicalStructure($indexes){
    $data = [];
    foreach($indexes as $d){
        $temp = &$data;
        foreach(explode("\\",$d['namespace']) as $namespace){
            if(!isset($temp[$namespace])){
                $temp[$namespace] = [];
            }
            $temp = &$temp[$namespace];
        }
    }
    
    return $data;
}

Split the namespaces by \\ and maintain a $data variable. Use & address reference to keep editing the same copy of the array.

Step 2: Sort the hierarchy in first folders then files fashion.

function fileSystemSorting(&$indexes){
    foreach($indexes as $key => &$value){
        fileSystemSorting($value);
    }
    
    uksort($indexes,function($key1,$key2) use ($indexes){
        if(count($indexes[$key1]) == 0 && count($indexes[$key2]) > 0) return 1;
        if(count($indexes[$key2]) == 0 && count($indexes[$key1]) > 0) return -1;
        return strnatcmp($key1,$key2);
    });
}

Sort the subordinate folders and use uksort for the current level of folders. Vice-versa would also work. If both 2 folders in comparison have subfolders, compare them as strings, else if one is a folder and another is a file, make folders come above.

Step 3: Flatten the hierarchical structure now that they are in order.

function flattenFileSystemResults($hierarchical_data){
    $result = [];
    foreach($hierarchical_data as $key => $value){
        if(count($value) > 0){
            $sub_result = flattenFileSystemResults($value);
            foreach($sub_result as $r){
                $result[] = $key . "\\" . $r;
            }   
        }else{
            $result[] = $key;
        }
    }
    
    return $result;
}

Step 4: Restore the initial data keys back and return the result.

function associateKeys($data,$indexes){
    $map = array_combine(array_column($indexes,'namespace'),array_keys($indexes));
    $result = [];
    foreach($data as $val){
        $result[ $map[$val] ] = ['namespace' => $val];
    }
    return $result;
}

Driver code:

function foldersBeforeFiles($indexes){
   $hierarchical_data = createHierarchicalStructure($indexes);
   fileSystemSorting($hierarchical_data);
   return associateKeys(flattenFileSystemResults($hierarchical_data),$indexes);
}

print_r(foldersBeforeFiles($indexes));

Demo: https://3v4l.org/cvoB2



来源:https://stackoverflow.com/questions/64471578/sorting-an-array-of-strings-by-arbitrary-numbers-of-substrings

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!