Algorithm to tell if two arrays have identical members

渐次进展 2020-11-29 06:52

What's the best algorithm for comparing two arrays to see if they have the same members?

Assume there are no duplicates, the members can be in any order, and that n

16 answers
  • 2020-11-29 06:53

    Which solution is "best" depends on your constraints. For a small data set, sorting, hashing, and brute-force comparison (like nickf posted) will all perform about the same. Because you know that you're dealing with integer values, you can sort in O(n) time (e.g. with radix sort), and a hash table also takes O(n) time on average. As always, each approach has drawbacks: sorting requires either duplicating the data or, if you want to save space, destructively sorting your array (losing the current ordering). A hash table carries the memory overhead of building the table. nickf's method needs little to no extra memory, but you have to accept the O(n^2) runtime. Choose whichever is best for your purposes.

  • 2020-11-29 06:54

    This can be done in different ways:

    1 - Brute force: for each element in array1, check that it exists in array2. You have to record the matched position/index so that duplicates are handled properly. This takes O(n^2) time and needs fairly complicated code, so don't even think of it...

    2 - Sort both lists, then check element by element that they are identical. Sorting is O(n log n) and the check is O(n), so O(n log n) overall. The sort can be done in place if reordering the arrays is not a problem; otherwise you need memory of size 2n to hold the sorted copies.

    3 - Add the items of one array to a hashtable together with their counts, then iterate through the other array, checking that each item is in the hashtable. If it is, decrement its count, and remove the item from the hashtable once the count reaches zero; if it is not, the arrays differ. Building the hashtable is O(n) and checking the other array against it is O(n), so O(n) overall. The hashtable needs memory for at most n elements.

    4 - Best of Best (Among the above): Subtract the elements at the same index of the two arrays and sum up the subtracted values. E.g. for A1={1,2,3} and A2={3,1,2}, Diff={-2,1,1}, and the sum of Diff is 0, which means they have the same set of integers. This approach takes O(n) time with no extra memory. C# code would look as follows:

        public static bool ArrayEqual(int[] list1, int[] list2)
        {
            if (list1 == null || list2 == null)
            {
                throw new Exception("Invalid input");
            }
    
            if (list1.Length != list2.Length)
            {
                return false;
            }
    
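            // Accumulate the element-wise differences at matching indices.
            int diff = 0;
    
            for (int i = 0; i < list1.Length; i++)
            {
                diff += list1[i] - list2[i];
            }
    
            // Caveat: a zero total only shows that the sums match, not that the members do.
            return (diff == 0);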
        }
    

    Method 4 doesn't work at all; it is the worst. Arrays with different members can still have equal sums, so a zero total proves nothing.
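
    To make the objection to method 4 concrete, here is a minimal Python sketch of the same summed-difference logic. The function name `sum_diff_equal` is made up for illustration and is not part of the original answer. It accepts a pair of arrays whose members clearly differ:

```python
def sum_diff_equal(list1, list2):
    """Python transcription of the summed-difference check from method 4."""
    if len(list1) != len(list2):
        return False
    diff = 0
    for a, b in zip(list1, list2):
        diff += a - b
    return diff == 0


# Same members in a different order: accepted, as intended.
reordered = sum_diff_equal([1, 2, 3], [3, 1, 2])

# Different members with the same sum (1 + 4 == 2 + 3): also accepted,
# which is a false positive.
false_positive = sum_diff_equal([1, 4], [2, 3])
```

    Both calls return `True`, so the check cannot tell the second pair apart from a genuine match.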

  • 2020-11-29 06:56

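    If the elements of each array are known to be distinct, XOR (bitwise XOR) all the elements of both arrays together. If both arrays have the same set of numbers, the result is zero. The time complexity is O(n). Be aware that a zero result is necessary but not sufficient: two different sets can also XOR to zero, so a nonzero result proves a mismatch while a zero result does not prove a match.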
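
    A minimal Python sketch of the XOR check, assuming integer elements; the helper name `xor_of_both` is hypothetical. It also shows a pair of distinct-element arrays with different members that still XOR to zero, so the check is best treated as a quick rejection test:

```python
from functools import reduce
from operator import xor


def xor_of_both(a, b):
    """XOR together every element of both arrays."""
    return reduce(xor, a, 0) ^ reduce(xor, b, 0)


# Same members, different order: every value cancels itself out.
matching = xor_of_both([1, 2, 3], [3, 1, 2])

# One differing member: the result is nonzero, so the arrays differ.
mismatch = xor_of_both([1, 2, 3], [1, 2, 4])

# Different members, yet 1 ^ 2 ^ 3 == 0 and 4 ^ 5 ^ 1 == 0,
# so the combined XOR is zero as well (a false positive).
collision = xor_of_both([1, 2, 3], [4, 5, 1])
```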

  • 2020-11-29 06:57

    You could load one into a hash table, keeping track of how many elements it has. Then, loop over the second one checking to see if every one of its elements is in the hash table, and counting how many elements it has. If every element in the second array is in the hash table, and the two lengths match, they are the same, otherwise they are not. This should be O(N).

    To make this work in the presence of duplicates, track how many times each element has been seen. Increment its counter while looping over the first array, and decrement it while looping over the second array. During the loop over the second array, if you can't find an element in the hash table, or if its counter is already at zero, the arrays are unequal. Also compare the total counts.
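
    The counting scheme described above could be sketched in Python roughly as follows. This is an illustration under the assumption that the elements are hashable, and the function name `have_same_members` is made up here:

```python
def have_same_members(first, second):
    """Compare two arrays as multisets using a hash table of counts."""
    # Different total counts can never match.
    if len(first) != len(second):
        return False

    # Increment a per-element counter while looping over the first array.
    counts = {}
    for item in first:
        counts[item] = counts.get(item, 0) + 1

    # Decrement while looping over the second array; a missing element or
    # a counter that is already at zero means the arrays are unequal.
    for item in second:
        remaining = counts.get(item, 0)
        if remaining == 0:
            return False
        counts[item] = remaining - 1

    # Lengths match and no counter went negative, so every counter is zero.
    return True
```

    In Python specifically, comparing `collections.Counter(first) == collections.Counter(second)` expresses the same idea in one line.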

    Another method that would work in the presence of duplicates is to sort both arrays and do a linear compare. This should be O(N*log(N)).
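
    A short sketch of the sort-then-compare variant in Python, assuming the elements can be ordered; the name `same_members_sorted` is hypothetical. It sorts copies, so the input arrays keep their original order at the cost of O(N) extra memory:

```python
def same_members_sorted(first, second):
    """Sort copies of both arrays, then compare them element by element."""
    if len(first) != len(second):
        return False
    # sorted() returns new lists, leaving the inputs untouched;
    # list equality then performs the linear comparison.
    return sorted(first) == sorted(second)
```

    If destroying the original order is acceptable, calling `first.sort()` and `second.sort()` instead would avoid the copies.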

  • 2020-11-29 06:59

    Obvious answers would be:

    1. Sort both lists, then check each element to see if they're identical
    2. Add the items from one array to a hashtable, then iterate through the other array, checking that each item is in the hashtable
    3. nickf's iterative search algorithm

    Which one you'd use would depend on whether you can sort the lists first, and whether you have a good hash algorithm handy.

  • 2020-11-29 06:59

    If you sort both arrays first, you'd get O(N log(N)).
