How to calculate the median of an array?

北海茫月 2020-12-05 06:10

I'm trying to calculate the total, mean and median of an array that's populated by input received from a textfield. I've managed to work out the total and the mean, I just ca

14 answers
  • 2020-12-05 06:59

    Sorting the array is unnecessary and inefficient. A variation of the QuickSort algorithm, QuickSelect, has an average run time of O(n), whereas sorting first costs O(n log n). It finds the nth smallest item in a list; for a median, you use n = half the list length. Let's call it quickNth(list, n).

    The concept is that to find the nth smallest, choose a 'pivot' value. (Exactly how you choose it isn't critical; if you know the data will be thoroughly random, you can take the first item on the list.)

    Split the original list into three smaller lists:

    • One with values smaller than the pivot.
    • One with values equal to the pivot.
    • And one with values greater than the pivot.

    You then have three cases:

    1. The "smaller" list has >= n items. In that case, you know that the nth smallest is in that list. Return quickNth(smaller, n).
    2. The smaller list has < n items, but the sum of the lengths of the smaller and equal lists is >= n. In this case, the nth smallest is equal to any item in the "equal" list; you're done.
    3. n is greater than the sum of the lengths of the smaller and equal lists. In that case, you can essentially skip over those two, and adjust n accordingly. Return quickNth(greater, n - length(smaller) - length(equal)).

    Done.

    If you're not sure that the data is thoroughly random, you need to be more sophisticated about choosing the pivot. Taking the median of the first value in the list, the last value in the list, and the one midway between the two works pretty well.

    If you're very unlucky with your choice of pivots and always pick the smallest or largest value, this takes O(n^2) time; that's bad. But it's also very unlikely if you choose your pivot with a decent algorithm.

    Sample code:

    import java.util.*;
    
    public class Utility {
       /****************
       * @param coll an ArrayList of Number objects
       * @param comp the Comparator that defines the ordering of the elements
       * @return the median of coll
       *****************/
       
       public static <T extends Number> double median(ArrayList<T> coll, Comparator<T> comp) {
          double result;
          int n = coll.size()/2;
          
          if (coll.size() % 2 == 0)  // even number of items; find the middle two and average them
             result = (nth(coll, n-1, comp).doubleValue() + nth(coll, n, comp).doubleValue()) / 2.0;
          else                      // odd number of items; return the one in the middle
             result = nth(coll, n, comp).doubleValue();
             
          return result;
       } // median(coll)
       
       
    
       /*****************
       * @param coll a collection of Comparable objects
       * @param n  the position of the desired object, using the ordering defined on the list elements
       * @return the nth smallest object
       *******************/
       
       public static <T> T nth(ArrayList<T> coll, int n, Comparator<T> comp) {
          T result, pivot;
          ArrayList<T> underPivot = new ArrayList<>(), overPivot = new ArrayList<>(), equalPivot = new ArrayList<>();
          
          // choosing a pivot is a whole topic in itself.
          // this implementation uses the simple strategy of grabbing the element in the middle of the ArrayList.
          
          pivot = coll.get(coll.size() / 2);
          
          // split coll into 3 lists based on comparison with the pivot
          
          for (T obj : coll) {
             int order = comp.compare(obj, pivot);
             
             if (order < 0)        // obj < pivot
                underPivot.add(obj);
             else if (order > 0)   // obj > pivot
                overPivot.add(obj);
             else                  // obj = pivot
                equalPivot.add(obj);
          } // for each obj in coll
          
          // recurse on the appropriate list
          
          if (n < underPivot.size())
             result = nth(underPivot, n, comp);
          else if (n < underPivot.size() + equalPivot.size()) // equal to pivot; just return it
             result = pivot;
          else  // everything in underPivot and equalPivot is too small.  Adjust n accordingly in the recursion.
             result = nth(overPivot, n - underPivot.size() - equalPivot.size(), comp);
             
          return result;
       } // nth(coll, n)
       
       
       
       public static void main (String[] args) {
          Comparator<Integer> comp = Comparator.naturalOrder();
          Random rnd = new Random();
          
          for (int size = 1; size <= 10; size++) {
             ArrayList<Integer> coll = new ArrayList<>(size);
             for (int i = 0; i < size; i++)
                coll.add(rnd.nextInt(100));
          
             System.out.println("Median of " + coll.toString() + " is " + median(coll, comp));
          } // for a range of possible input sizes
       } // main(args)
    } // Utility
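
The median-of-three pivot choice described above (first, last, and midway values) could be sketched as follows. This is only an illustration: `PivotChooser` and `medianOfThree` are hypothetical names that do not appear in the answer's `Utility` class.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of the Utility class above.
class PivotChooser {
    /** Returns the median of the first, middle, and last elements of a non-empty list. */
    static <T> T medianOfThree(List<T> coll, Comparator<? super T> comp) {
        T a = coll.get(0);
        T b = coll.get(coll.size() / 2);
        T c = coll.get(coll.size() - 1);

        // Three compare-and-swap steps sort the three candidates.
        if (comp.compare(a, b) > 0) { T t = a; a = b; b = t; } // now a <= b
        if (comp.compare(b, c) > 0) { T t = b; b = c; c = t; } // now c is the largest
        if (comp.compare(a, b) > 0) { T t = a; a = b; b = t; } // now a <= b <= c
        return b;
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(9, 1, 5, 7, 3);
        // Candidates are 9 (first), 5 (middle), 3 (last).
        System.out.println(medianOfThree(data, Comparator.<Integer>naturalOrder())); // prints 5
    }
}
```

Under these assumptions, the pivot line in `nth` could become `pivot = PivotChooser.medianOfThree(coll, comp);`. Any element of the list is a valid pivot, so this only affects running time, not the result.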
    
  • 2020-12-05 07:04

    I faced a similar problem yesterday. I wrote a method with Java generics to calculate the median of any collection of Numbers; you can apply it to collections of Doubles, Integers, or Floats, and it returns a double. Note that the method copies the input into another collection so that the original is not altered. I also provide a test; have fun. ;-)

    public static <T extends Number & Comparable<T>> double median(Collection<T> numbers){
        if(numbers.isEmpty()){
            throw new IllegalArgumentException("Cannot compute median on empty collection of numbers");
        }
        List<T> numbersList = new ArrayList<>(numbers);
        Collections.sort(numbersList);
        int middle = numbersList.size()/2;
        if(numbersList.size() % 2 == 0){
            return 0.5 * (numbersList.get(middle).doubleValue() + numbersList.get(middle-1).doubleValue());
        } else {
            return numbersList.get(middle).doubleValue();
        }
    
    }
    

    JUnit test code snippet:

    /**
     * Test of median method, of class Utils.
     */
    @Test
    public void testMedian() {
        System.out.println("median");
        Double expResult = 3.0;
        Double result = Utils.median(Arrays.asList(3.0,2.0,1.0,9.0,13.0));
        assertEquals(expResult, result);
        expResult = 3.5;
        result = Utils.median(Arrays.asList(3.0,2.0,1.0,9.0,4.0,13.0));
        assertEquals(expResult, result);
    }
    

    Usage example (consider the class name is Utils):

    List<Integer> intValues = ... //omitted init
    Set<Float> floatValues = ... //omitted init
    .....
    double intListMedian = Utils.median(intValues);
    double floatSetMedian = Utils.median(floatValues);
    

    Note: my method works on collections; you can convert arrays of numbers to lists of numbers as pointed out here
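
One possible way to do that conversion for primitive arrays, assuming Java 8 or later, is to box the elements through a stream. `ArrayBoxing` and `toList` are hypothetical names chosen for this sketch.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper: boxes primitive arrays so they can be passed to a Collection-based median.
class ArrayBoxing {
    static List<Integer> toList(int[] values) {
        return Arrays.stream(values).boxed().collect(Collectors.toList());
    }

    static List<Double> toList(double[] values) {
        return Arrays.stream(values).boxed().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(toList(new int[] {3, 1, 2})); // prints [3, 1, 2]
    }
}
```

With this in place, a call such as `Utils.median(ArrayBoxing.toList(new int[] {3, 1, 2}))` should type-check against the generic `median` method above.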

  • 2020-12-05 07:04

    As @Bruce-Feist mentions, for a large number of elements I'd avoid any solution involving a sort if performance is a concern. A different approach from those suggested in the other answers is Hoare's algorithm for finding the k-th smallest of n items. It runs in O(n) time on average. Note that it reorders the array passed in, and that k is a 0-based index.

    public int findKthSmallest(int[] array, int k)
    {
        if (array.length < 10)
        {
            Arrays.sort(array);
            return array[k];
        }
        int start = 0;
        int end = array.length - 1;
        int x, temp;
        int i, j;
        while (start < end)
        {
            x = array[k];
            i = start;
            j = end;
            do
            {
                while (array[i] < x)
                    i++;
                while (x < array[j])
                    j--;
                if (i <= j)
                {
                    temp = array[i];
                    array[i] = array[j];
                    array[j] = temp;
                    i++;
                    j--;
                }
            } while (i <= j);
            if (j < k)
                start = i;
            if (k < i)
                end = j;
        }
        return array[k];
    }
    

    And to find the median:

    public double median(int[] array)
    {
        int length = array.length;
        if ((length & 1) == 0) // even: average the two middle elements (0-based ranks length/2 - 1 and length/2)
            return (findKthSmallest(array, length / 2 - 1) + findKthSmallest(array, length / 2)) / 2.0;
        else // odd: the single middle element
            return findKthSmallest(array, length / 2);
    }
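
Because the selection loop swaps elements in place, callers who need their array left intact may prefer to work on a copy. The sketch below repeats the same selection scheme (without the small-array shortcut) in a self-contained class; `SelectMedian` and `select` are hypothetical names, and the rank `k` is assumed to be 0-based.

```java
import java.util.Arrays;

// Hypothetical wrapper: computes the median on a copy so the caller's array is not reordered.
class SelectMedian {
    // Hoare-style selection; returns the element of 0-based rank k and reorders a.
    static int select(int[] a, int k) {
        int start = 0, end = a.length - 1;
        while (start < end) {
            int x = a[k];
            int i = start, j = end;
            do {
                while (a[i] < x) i++;
                while (x < a[j]) j--;
                if (i <= j) {
                    int t = a[i]; a[i] = a[j]; a[j] = t;
                    i++;
                    j--;
                }
            } while (i <= j);
            if (j < k) start = i;
            if (k < i) end = j;
        }
        return a[k];
    }

    static double median(int[] input) {
        if (input.length == 0) throw new IllegalArgumentException("empty array");
        int[] a = Arrays.copyOf(input, input.length); // leave the caller's array untouched
        int mid = a.length / 2;
        if ((a.length & 1) == 1) return select(a, mid);
        // Even length: average the two middle ranks; halves are added separately to avoid int overflow.
        return select(a, mid - 1) / 2.0 + select(a, mid) / 2.0;
    }
}
```

The copy costs O(n) extra memory, which is the price of not mutating the input.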
    
  • 2020-12-05 07:05

    You can find a good explanation at https://www.youtube.com/watch?time_continue=23&v=VmogG01IjYc

    The idea is to use two heaps: one max-heap and one min-heap.

    import java.util.Comparator;
    import java.util.PriorityQueue;
    import java.util.Queue;

    class Heap {
        // max-heap holding the smaller half of the values
        private Queue<Integer> low = new PriorityQueue<>(Comparator.reverseOrder());
        // min-heap holding the larger half of the values
        private Queue<Integer> high = new PriorityQueue<>();

        public void add(int number) {
            // keep the sizes within one of each other; low gets the extra element
            Queue<Integer> target = low.size() <= high.size() ? low : high;
            target.add(number);
            balance();
        }

        private void balance() {
            // swap the heads until every value in low is <= every value in high
            while (!low.isEmpty() && !high.isEmpty() && low.peek() > high.peek()) {
                Integer lowHead = low.poll();
                Integer highHead = high.poll();
                low.add(highHead);
                high.add(lowHead);
            }
        }

        public double median() {
            if (low.isEmpty() && high.isEmpty()) {
                throw new IllegalStateException("Heap is empty");
            } else {
                return low.size() == high.size() ? (low.peek() + high.peek()) / 2.0 : low.peek();
            }
        }
    }
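
The two-heap idea is mostly useful when values arrive one at a time and the median is needed after each insertion. The sketch below is a common variant of the same technique rather than the class above: every new value passes through the max-heap first, which keeps the two halves ordered without a swap loop. `RunningMedian` is a hypothetical name.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Hypothetical variant of the two-heap approach for a running median.
class RunningMedian {
    private final PriorityQueue<Integer> low = new PriorityQueue<>(Comparator.reverseOrder()); // max-heap, smaller half
    private final PriorityQueue<Integer> high = new PriorityQueue<>();                         // min-heap, larger half

    void add(int number) {
        low.add(number);
        high.add(low.poll());          // move the largest of the smaller half across
        if (high.size() > low.size()) {
            low.add(high.poll());      // keep low the same size as high, or one larger
        }
    }

    double median() {
        if (low.isEmpty()) throw new IllegalStateException("no values added");
        // Halves are added separately so the sum of two large ints cannot overflow.
        return low.size() == high.size() ? low.peek() / 2.0 + high.peek() / 2.0 : low.peek();
    }

    public static void main(String[] args) {
        RunningMedian rm = new RunningMedian();
        for (int value : new int[] {5, 15, 1, 3}) {
            rm.add(value);
            System.out.println("after " + value + ": " + rm.median());
        }
    }
}
```

Each insertion costs O(log n) and reading the median is O(1), at the price of keeping every value in memory.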

  • 2020-12-05 07:10

    Check out the Arrays.sort methods:

    http://docs.oracle.com/javase/6/docs/api/java/util/Arrays.html

    You should also really abstract finding the median into its own method, and just return the value to the calling method. This will make testing your code much easier.
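
As a rough illustration of that advice, the median could live in a small static method built on `Arrays.sort`, so the calling code only deals with the returned value. The class name `Stats` and the choice of `double[]` as input are assumptions made for this sketch.

```java
import java.util.Arrays;

// Hypothetical utility class: isolates the median calculation so it can be tested on its own.
class Stats {
    static double median(double[] values) {
        if (values.length == 0) throw new IllegalArgumentException("values must not be empty");
        double[] sorted = values.clone(); // sort a copy so the caller's array keeps its order
        Arrays.sort(sorted);
        int mid = sorted.length / 2;
        return sorted.length % 2 == 0
                ? (sorted[mid - 1] + sorted[mid]) / 2.0 // even: average the two middle values
                : sorted[mid];                          // odd: the single middle value
    }

    public static void main(String[] args) {
        System.out.println(median(new double[] {4, 1, 3, 2})); // prints 2.5
    }
}
```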

  • 2020-12-05 07:12

    Use Arrays.sort and then take the middle element (in case the number n of elements in the array is odd) or take the average of the two middle elements (in case n is even).

      public static long median(long[] l)
      {
        Arrays.sort(l);
        int middle = l.length / 2;
        if (l.length % 2 == 0)
        {
          long left = l[middle - 1];
          long right = l[middle];
          return (left + right) / 2;
        }
        else
        {
          return l[middle];
        }
      }
    

    Here are some examples:

      @Test
      public void evenTest()
      {
        long[] l = {
            5, 6, 1, 3, 2, 4
        };
        Assert.assertEquals((3 + 4) / 2, median(l));
      }
    
      @Test
      public void oddTest()
      {
        long[] l = {
            5, 1, 3, 2, 4
        };
        Assert.assertEquals(3, median(l));
      }
    

    And in case your input is a Collection, you might use Google Guava to do something like this:

    public static long median(Collection<Long> numbers)
    {
      return median(Longs.toArray(numbers)); // requires import com.google.common.primitives.Longs;
    }
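
If adding Guava is not an option, a similar conversion can be sketched with the standard stream API (Java 8 or later). This version also returns a `double`, so the average of the two middle values is not truncated by integer division; `LongMedian` is a hypothetical name, and null elements in the collection would cause a `NullPointerException`.

```java
import java.util.Arrays;
import java.util.Collection;

// Hypothetical Guava-free alternative for Collection<Long> input.
class LongMedian {
    static double median(Collection<Long> numbers) {
        if (numbers.isEmpty()) throw new IllegalArgumentException("numbers must not be empty");
        // Unbox into a new sorted long[]; the collection itself is not modified.
        long[] sorted = numbers.stream().mapToLong(Long::longValue).sorted().toArray();
        int middle = sorted.length / 2;
        if (sorted.length % 2 == 0) {
            // Halves are added separately so the sum of two large longs cannot overflow.
            return sorted[middle - 1] / 2.0 + sorted[middle] / 2.0;
        }
        return sorted[middle];
    }

    public static void main(String[] args) {
        System.out.println(median(Arrays.asList(1L, 2L))); // prints 1.5
    }
}
```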
    