Algorithm: efficient way to remove duplicate integers from an array

离开以前 2020-11-22 16:03

I got this problem from an interview with Microsoft.

Given an array of random integers, write an algorithm in C that removes duplicated numbers an

30 answers
  • 2020-11-22 16:15

    My girlfriend suggested a variation of merge sort. The only modification is that the merge step discards duplicated values. This solution is also O(n log n), with sorting and duplicate removal combined into one procedure. I'm not sure whether that makes any practical difference, though.
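    A minimal sketch of that idea, under some assumptions: the function names (`merge_unique`, `sort_unique_rec`, `merge_sort_unique`) are hypothetical, the merge uses an O(n) scratch buffer rather than being in place, and the result comes back sorted, so the original order is not preserved.

```c
#include <stdlib.h>
#include <string.h>

/* Merge two sorted, duplicate-free runs a[0..na) and b[0..nb) into out,
 * emitting each value once. Returns the number of values written. */
static size_t merge_unique(const int *a, size_t na,
                           const int *b, size_t nb, int *out)
{
    size_t i = 0, j = 0, k = 0;
    while (i < na && j < nb) {
        if (a[i] < b[j]) {
            out[k++] = a[i++];
        } else if (b[j] < a[i]) {
            out[k++] = b[j++];
        } else {                /* equal: keep one copy, skip the other */
            out[k++] = a[i++];
            j++;
        }
    }
    while (i < na) out[k++] = a[i++];
    while (j < nb) out[k++] = b[j++];
    return k;
}

/* Sort arr[0..n) and drop duplicates; tmp must hold at least n ints.
 * Returns the new length; the result occupies the front of arr. */
static size_t sort_unique_rec(int *arr, size_t n, int *tmp)
{
    if (n < 2) return n;
    size_t mid = n / 2;
    size_t nl = sort_unique_rec(arr, mid, tmp);
    size_t nr = sort_unique_rec(arr + mid, n - mid, tmp);
    size_t k = merge_unique(arr, nl, arr + mid, nr, tmp);
    memcpy(arr, tmp, k * sizeof *arr);
    return k;
}

/* Hypothetical entry point. Returns the new length, or (size_t)-1 if
 * the scratch buffer cannot be allocated. */
size_t merge_sort_unique(int *arr, size_t n)
{
    if (n < 2) return n;
    int *tmp = malloc(n * sizeof *tmp);
    if (tmp == NULL) return (size_t)-1;
    size_t k = sort_unique_rec(arr, n, tmp);
    free(tmp);
    return k;
}
```

    Because each half is already duplicate-free when it is merged, the runs shrink as the recursion unwinds, which is where the combined approach could save some work on inputs with many repeats.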

  • 2020-11-22 16:15

    1. Using O(1) extra space, in O(n log n) time

    This is possible, for instance:

    • first do an in-place O(n log n) sort
    • then walk through the list once, writing the first instance of each value back toward the beginning of the list
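    The two steps above can be sketched as follows. This is an illustration, not a definitive implementation: `sort_then_compact` and `cmp_int` are hypothetical names, and the C standard does not promise that `qsort` runs in O(n log n) time or O(1) extra space, so a strict version would swap in something like an in-place heapsort.

```c
#include <stdlib.h>

/* Comparison callback for qsort; avoids the overflow risk of "a - b". */
static int cmp_int(const void *pa, const void *pb)
{
    int a = *(const int *)pa;
    int b = *(const int *)pb;
    return (a > b) - (a < b);
}

/* Sorts arr[0..n) and compacts it so each value appears once.
 * Returns the number of unique values, now stored at the front of arr. */
size_t sort_then_compact(int *arr, size_t n)
{
    if (n == 0) return 0;
    qsort(arr, n, sizeof *arr, cmp_int);

    size_t dst = 1;                         /* arr[0] is always kept */
    for (size_t src = 1; src < n; src++) {
        if (arr[src] != arr[dst - 1]) {     /* first instance of a new value */
            arr[dst++] = arr[src];
        }
    }
    return dst;
}
```

    The caller treats the returned count as the new logical length; the elements past it are left over from the sort and should be ignored.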

    I believe ejel's partner is correct that the best way to do this would be an in-place merge sort with a simplified merge step, and that this is probably the intent of the question. It fits the case where you are, e.g., writing a new library function that must be as efficient as possible and cannot make assumptions about its inputs. Depending on the sort of inputs, there would be cases where it is useful to do this without a hash table. But I haven't actually checked this.

    2. Using O(lots) extra space, in O(n) time

    • declare a zeroed array with one entry for every possible integer value
    • walk through the input array once
    • for each integer, set the corresponding flag-array element to 1
    • if it was already 1, skip that integer

    This only works if several questionable assumptions hold:

    • it's possible to zero memory cheaply, or the size of the ints is small compared to the number of them
    • you're happy to ask your OS for 256^sizeof(int) bytes of memory
    • and it will cache it for you really, really efficiently if it's gigantic

    It's a bad answer in general, but if you have LOTS of input elements and they're all 8-bit integers (or maybe even 16-bit integers), it could be the best way.
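    For the 16-bit case, the flag array needs only 65536 entries, so the idea becomes practical. A minimal sketch, assuming the input is stored as `uint16_t` (the function name `dedupe_u16` is hypothetical):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Removes duplicates from arr[0..n) in place, keeping first occurrences
 * in their original order. Returns the new length. */
size_t dedupe_u16(uint16_t *arr, size_t n)
{
    /* One flag per possible 16-bit value: 65536 bytes.
     * static keeps it off the stack, so this sketch is not reentrant. */
    static unsigned char seen[65536];
    memset(seen, 0, sizeof seen);

    size_t dst = 0;
    for (size_t src = 0; src < n; src++) {
        uint16_t v = arr[src];
        if (!seen[v]) {          /* first time this value appears */
            seen[v] = 1;
            arr[dst++] = v;
        }
    }
    return dst;
}
```

    Zeroing the table costs a fixed 64 KiB per call, which is why this only pays off when the input is much larger than the table.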

    3. O(little)-ish extra space, O(n)-ish time

    Same as #2, but use a hash table instead of the flag array.

    4. The clear way

    If the number of elements is small, writing a clever algorithm is not worthwhile when other code is quicker to write and quicker to read.

    E.g. walk through the array once for each unique element (i.e. the first element, then the second element once duplicates of the first have been removed, etc.), removing all elements identical to it. O(1) extra space, O(n^2) time.

    E.g. use library functions that do this. Efficiency depends on which ones you have readily available.
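    The walk-through variant (the O(1) extra space, O(n^2) time one) could look roughly like this. It is only a sketch, and `dedupe_quadratic` is a hypothetical name:

```c
#include <stddef.h>

/* For each kept element arr[i], squeeze every later copy of it out of
 * the tail. Order of first occurrences is preserved.
 * Returns the new length. */
size_t dedupe_quadratic(int *arr, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        size_t dst = i + 1;
        for (size_t src = i + 1; src < n; src++) {
            if (arr[src] != arr[i]) {   /* keep anything that is not a copy */
                arr[dst++] = arr[src];
            }
        }
        n = dst;                        /* the tail shrinks after each pass */
    }
    return n;
}
```

    No sorting and no extra memory are involved, so for a handful of elements this is probably the easiest version to read and verify.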

  • 2020-11-22 16:15

    How about the following?

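    /* needs <stdlib.h> and <string.h>; check malloc for NULL in real code */
    int *temp = malloc(sizeof(int) * len);
    int count = 0;  /* number of unique values collected so far */
    int x = 0;
    int y = 0;
    for (x = 0; x < len; x++)
    {
        /* look for array[x] among the values already kept */
        for (y = 0; y < count; y++)
        {
            if (temp[y] == array[x])
            {
                break;
            }
        }
        if (y == count)  /* not found: this is a first occurrence */
        {
            temp[count] = array[x];
            count++;
        }
    }
    /* copy back only the unique values; array[0..count) is the result */
    memcpy(array, temp, sizeof(int) * count);
    free(temp);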
    

    I declare a temporary array, collect the unique elements in it, and then copy them back to the original array. This takes O(n^2) time and O(n) extra space.

  • 2020-11-22 16:15

    This can be done in a single pass, in O(N) time in the number of integers in the input list, and O(N) storage in the number of unique integers.

    Walk through the list from front to back, with two pointers "dst" and "src" initialized to the first item. Start with an empty hash table of "integers seen". If the integer at src is not present in the hash, write it to the slot at dst and increment dst. Add the integer at src to the hash, then increment src. Repeat until src passes the end of the input list.
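    One way this could look in C, which has no built-in hash table. The sketch below rolls a small open-addressing set with linear probing; the names (`dedupe_hashed`, `hash_int`, `next_pow2`) and the multiplicative hash constant are assumptions chosen for illustration. The O(N) bound is an expected one, and for simplicity the table is sized from the input length rather than from the number of unique integers.

```c
#include <stdint.h>
#include <stdlib.h>

/* Smallest power of two >= x (for x >= 1). */
static size_t next_pow2(size_t x)
{
    size_t p = 1;
    while (p < x) p <<= 1;
    return p;
}

/* Simple multiplicative hash; mask is (table capacity - 1). */
static size_t hash_int(int v, size_t mask)
{
    uint32_t h = (uint32_t)v * 2654435761u;
    h ^= h >> 16;
    return (size_t)h & mask;
}

/* Removes duplicates from arr[0..n) in place, preserving the order of
 * first occurrences. Returns the new length, or (size_t)-1 if the
 * table cannot be allocated. Assumes 2 * n does not overflow size_t. */
size_t dedupe_hashed(int *arr, size_t n)
{
    if (n < 2) return n;
    size_t cap = next_pow2(2 * n);          /* load factor stays <= 0.5 */
    int *keys = malloc(cap * sizeof *keys);
    unsigned char *used = calloc(cap, 1);   /* 0 = empty slot */
    if (keys == NULL || used == NULL) {
        free(keys);
        free(used);
        return (size_t)-1;
    }

    size_t dst = 0;
    for (size_t src = 0; src < n; src++) {
        int v = arr[src];
        size_t slot = hash_int(v, cap - 1);
        while (used[slot] && keys[slot] != v) {
            slot = (slot + 1) & (cap - 1);  /* linear probing */
        }
        if (!used[slot]) {                  /* not seen before */
            used[slot] = 1;
            keys[slot] = v;
            arr[dst++] = v;                 /* dst never overtakes src */
        }
    }
    free(keys);
    free(used);
    return dst;
}
```

    Writing through `dst` is safe because `dst` is always at or behind `src`, so no unread element is overwritten.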

  • 2020-11-22 16:16
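    import java.util.ArrayList;
    import java.util.List;

    public class C {

        public static void main(String[] args) {
            // This approach assumes the input is already sorted,
            // so duplicates are always adjacent.
            int[] arr = {2, 5, 5, 5, 9, 11, 11, 23, 34, 34, 34, 45, 45};

            List<Integer> unique = new ArrayList<>();

            for (int i = 0; i < arr.length; i++) {
                // Keep an element only if it differs from its predecessor.
                // This avoids marking duplicates with a sentinel such as
                // 99999, which could itself be a legitimate input value.
                if (i == 0 || arr[i] != arr[i - 1]) {
                    unique.add(arr[i]);
                }
            }

            System.out.println(unique); // [2, 5, 9, 11, 23, 34, 45]
        }
    }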
    
  • 2020-11-22 16:17

    You could do this in a single traversal, if you are willing to sacrifice memory. You can simply tally whether you have seen an integer or not in a hash/associative array. If you have already seen a number, remove it as you go, or better yet, move numbers you have not seen into a new array, avoiding any shifting in the original array.

    In Perl:

    foreach $i (@myary) {
        if(!defined $seen{$i}) {
            $seen{$i} = 1;
            push @newary, $i;
        }
    }
    