Find top N elements in an Array

南笙 2020-11-28 06:18

What would be the best way to find the top N (say 10) elements in an unordered list (of say 100)?

The solution which came in my head was to 1. sort it using quick s

12 Answers
  • 2020-11-28 06:50

    The best solution is to use whatever facilities your chosen language provides which will make your life easier.
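
    As an illustration of leaning on the language's own facilities, here is a minimal Java sketch that sorts a copy in descending order and keeps the first n values. The class name `TopNWithLibrary` and the method `topN` are made up for this example; the sample values are a handful taken from the output further down.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

final class TopNWithLibrary {
    // Sorts the values in descending order and keeps the first n of them.
    // The input list itself is left untouched.
    static List<Integer> topN(List<Integer> values, int n) {
        return values.stream()
                .sorted(Comparator.reverseOrder())
                .limit(n)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Integer> source = Arrays.asList(403, 459, 646, 467, 120, 998, 986);
        System.out.println(topN(source, 3)); // [998, 986, 646]
    }
}
```

    This sorts everything, so it costs O(n log n), which is unlikely to matter for 100 elements.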

    However, assuming this was a question more related to what algorithm you should choose, I'm going to suggest a different approach here. If you're talking about 10 from 100, you shouldn't generally worry too much about performance unless you want to do it many times per second.

    For example, this C code (which is about as inefficient as I can make it without being silly) still takes well under a tenth of a second to execute. That's not enough time for me to even think about going to get a coffee.

    #include <stdio.h>
    #include <stdlib.h>
    #include <time.h>
    
    #define SRCSZ 100
    #define DSTSZ 10
    
    int main (void) {
        int unused[SRCSZ], source[SRCSZ], dest[DSTSZ], i, j, pos;
    
        srand (time (NULL));
        for (i = 0; i < SRCSZ; i++) {
            unused[i] = 1;
            source[i] = rand() % 1000;
        }
    
        for (i = 0; i < DSTSZ; i++) {
            pos = -1;
            for (j = 0; j < SRCSZ; j++) {
                if (pos == -1) {
                    if (unused[j]) {
                        pos = j;
                    }
                } else {
                    if (unused[j] && (source[j] > source[pos])) {
                        pos = j;
                    }
                }
            }
            dest[i] = source[pos];
            unused[pos] = 0;
        }
    
        printf ("Source:");
        for (i = 0; i < SRCSZ; i++) printf (" %d", source[i]);
        printf ("\nDest:");
        for (i = 0; i < DSTSZ; i++) printf (" %d", dest[i]);
        printf ("\n");
    
        return 0;
    }
    

    Running it through time gives you (I've formatted the output a bit to make it readable, but haven't affected the results):

    Source: 403 459 646 467 120 346 430 247 68 312 701 304 707 443
            753 433 986 921 513 634 861 741 482 794 679 409 145 93
            512 947 19 9 385 208 795 742 851 638 924 637 638 141
            382 89 998 713 210 732 784 67 273 628 187 902 42 25
            747 471 686 504 255 74 638 610 227 892 156 86 48 133
            63 234 639 899 815 986 750 177 413 581 899 494 292 359
            60 106 944 926 257 370 310 726 393 800 986 827 856 835
            66 183 901
    Dest: 998 986 986 986 947 944 926 924 921 902
    
    real    0m0.063s
    user    0m0.046s
    sys     0m0.031s
    

    Only once the quantities of numbers become large should you usually worry. Don't get me wrong, I'm not saying you shouldn't think about performance. What you shouldn't do is spend too much time optimising things that don't matter - YAGNI and all that jazz.

    As with all optimisation questions, measure, don't guess!

  • 2020-11-28 06:51

    Yes, there is a way to do better than quicksort. As pointed out by Yin Zhu, you can search for the kth largest element first and then use that element's value as your pivot to split the array.
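
    One common way to realise this idea is quickselect, which finds the kth largest element and partitions the array around it in the same pass. The sketch below assumes that is the kind of selection being referred to; the class and method names are invented for illustration, and the result comes back in no particular order.

```java
import java.util.Arrays;
import java.util.concurrent.ThreadLocalRandom;

final class TopNQuickselect {
    // Returns the n largest values in unspecified order; works on a copy.
    static int[] topN(int[] values, int n) {
        int[] a = values.clone();
        if (n <= 0) return new int[0];
        if (n >= a.length) return a;
        int lo = 0, hi = a.length - 1;
        int k = n - 1; // index that must end up holding the n-th largest value
        while (lo < hi) {
            int p = partitionDescending(a, lo, hi);
            if (p == k) break;
            if (p < k) lo = p + 1; else hi = p - 1;
        }
        return Arrays.copyOf(a, n);
    }

    // Lomuto partition around a random pivot: values greater than the
    // pivot end up to its left, everything else to its right.
    private static int partitionDescending(int[] a, int lo, int hi) {
        int r = ThreadLocalRandom.current().nextInt(lo, hi + 1);
        swap(a, r, hi);
        int pivot = a[hi];
        int store = lo;
        for (int i = lo; i < hi; i++) {
            if (a[i] > pivot) swap(a, i, store++);
        }
        swap(a, store, hi);
        return store;
    }

    private static void swap(int[] a, int i, int j) {
        int t = a[i];
        a[i] = a[j];
        a[j] = t;
    }

    public static void main(String[] args) {
        int[] top = topN(new int[] {5, 2, 8, 7, 9}, 3);
        Arrays.sort(top); // order is unspecified, so sort before printing
        System.out.println(Arrays.toString(top)); // [7, 8, 9]
    }
}
```

    On average this does a linear amount of work, although this simple partition scheme slows down when the input contains many equal values.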

  • 2020-11-28 06:52
    import java.util.Arrays;

    public class FindTopValuesSelectionSortImpl implements FindTopValues {
    
    /**
     * Finds list of the highest 'n' values in the source list, ordered naturally, 
     * with the highest value at the start of the array and returns it 
     */
    @Override
    public int[] findTopNValues(int[] values, int n) {
        int length = values.length;
    
        for (int i=0; i<n; i++) {  // only the first n positions need to be filled
            int maxPos = i;
            for (int j=i+1; j<length; j++) {
                if (values[j] > values[maxPos]) {
                    maxPos = j;
                }
            }
    
            if (maxPos != i) {
                int maxValue = values[maxPos];
                values[maxPos] = values[i];
                values[i] = maxValue;
            }           
        }
        return Arrays.copyOf(values, n);        
    }
    }
    
  • 2020-11-28 06:55

    Yes, you can do it in O(n) for a fixed N by keeping a sorted running list of the top N. You can keep the running list sorted with the regular library functions or a sorting network. Here is a simple demo with N = 3 on the input below, showing which elements of the running list change at each iteration.

    5 2 8 7 9

    i = 0
    top[0] <= 5
    
    i = 1
    top[1] <= 2
    
    i = 2
    top[2] <= top[1] (2)
    top[1] <= top[0] (5)
    top[0] <= 8
    
    i = 3
    top[2] <= top[1] (5)
    top[1] <= 7
    
    i = 4
    top[2] <= top[1] (7)
    top[1] <= top[0] (8)
    top[0] <= 9
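
    The same walk-through can be sketched in Java. This is only an illustration, with hypothetical names (`RunningTopN`, `topN`): it shifts smaller entries down by hand instead of calling a library sort, so each element costs at most N moves.

```java
import java.util.Arrays;

final class RunningTopN {
    // Scans values once, keeping top[0..count-1] sorted in descending order.
    static int[] topN(int[] values, int n) {
        int[] top = new int[Math.max(0, Math.min(n, values.length))];
        int count = 0;
        for (int v : values) {
            // List is full and v does not beat its smallest entry: skip it.
            if (count == top.length && (count == 0 || v <= top[count - 1])) {
                continue;
            }
            // Append while there is room, otherwise overwrite the smallest.
            int pos = (count < top.length) ? count++ : count - 1;
            // Shift smaller entries down one slot to open a gap for v.
            while (pos > 0 && top[pos - 1] < v) {
                top[pos] = top[pos - 1];
                pos--;
            }
            top[pos] = v;
        }
        return top;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(topN(new int[] {5, 2, 8, 7, 9}, 3))); // [9, 8, 7]
    }
}
```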
    
  • 2020-11-28 06:59

    How about delegating everything to Java ;)

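    import java.util.*;
    import java.util.stream.Collectors;

    static List<Integer> findTopN(List<Integer> list, int n) {
        // Reverse order so the largest values come first; a TreeSet drops duplicates.
        Set<Integer> sortedSet = new TreeSet<>(Comparator.reverseOrder());

        // add all elements from list to sortedSet
        sortedSet.addAll(list);

        // return the first n from sortedSet
        return sortedSet.stream().limit(n).collect(Collectors.toList());
    }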
    

    I am not trying to say that this is the best way. I still think Yin Zhu's method of finding the kth largest element is the best answer.
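
    A TreeSet stores each distinct value only once, so repeated values would be counted a single time. If duplicates should be kept, a bounded min-heap is another library-based option. The following is a sketch under that assumption, using `java.util.PriorityQueue`; the class name `TopNWithHeap` is hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

final class TopNWithHeap {
    // Keeps at most n values in a min-heap; the smallest kept value sits
    // on top, so it is the one evicted when a larger value arrives.
    static List<Integer> topN(List<Integer> values, int n) {
        if (n <= 0) return new ArrayList<>();
        PriorityQueue<Integer> heap = new PriorityQueue<>(n);
        for (Integer v : values) {
            if (heap.size() < n) {
                heap.add(v);
            } else if (v > heap.peek()) {
                heap.poll();
                heap.add(v);
            }
        }
        List<Integer> top = new ArrayList<>(heap);
        top.sort(Collections.reverseOrder()); // largest first
        return top;
    }

    public static void main(String[] args) {
        List<Integer> source = Arrays.asList(986, 998, 986, 3, 986, 1);
        System.out.println(topN(source, 4)); // [998, 986, 986, 986]
    }
}
```

    Each element costs at most O(log n) heap work, so the whole scan is O(m log n) for m input values.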

  • 2020-11-28 07:00

    Below are both insertion sort and selection sort implementations. For larger data sets I suggest insertion sort over selection sort.

    public interface FindTopValues
    {
      int[] findTopNValues(int[] data, int n);
    }
    

    Insertion Sort Implementation:

    import java.util.Arrays;

    public class FindTopValuesInsertionSortImpl implements FindTopValues {
    
    /**
     * Finds list of the highest 'n' values in the source list, ordered naturally, 
     * with the highest value at the start of the array and returns it 
     */
    @Override
    public int[] findTopNValues(int[] values, int n) {
    
        int length = values.length;
        for (int i=1; i<length; i++) {
            int curPos = i;
            while ((curPos > 0) && (values[i] > values[curPos-1])) {
                curPos--;
            }
    
            if (curPos != i) {
                int element = values[i];
                System.arraycopy(values, curPos, values, curPos+1, (i-curPos));
                values[curPos] = element;
            }
        }       
    
        return Arrays.copyOf(values, n);        
    }   
    
    }
    

    Selection Sort Implementation:

    import java.util.Arrays;

    public class FindTopValuesSelectionSortImpl implements FindTopValues {
    
    /**
     * Finds list of the highest 'n' values in the source list, ordered naturally, 
     * with the highest value at the start of the array and returns it 
     */
    @Override
    public int[] findTopNValues(int[] values, int n) {
        int length = values.length;
    
        for (int i=0; i<n; i++) {  // only the first n positions need to be filled
            int maxPos = i;
            for (int j=i+1; j<length; j++) {
                if (values[j] > values[maxPos]) {
                    maxPos = j;
                }
            }
    
            if (maxPos != i) {
                int maxValue = values[maxPos];
                values[maxPos] = values[i];
                values[i] = maxValue;
            }           
        }
        return Arrays.copyOf(values, n);        
    }
    }
    