Code Golf: Shortest code to find a weighted median?

眼角桃花 2021-01-03 08:23

My try at code golfing.

The problem of finding the minimum value of ∑W_i*|X-X_i| reduces to finding the weighted median of a list of x[i] with weights <
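
A minimal Python sketch of that reduction, assuming positive weights. The names `cost` and `weighted_median` are illustrative and do not come from the post. The function returns a pair because, when a prefix of the sorted weights sums to exactly half the total, every point between two neighbouring values gives the same minimum.

```python
def cost(x0, xs, ws):
    """Weighted sum of absolute deviations from x0."""
    return sum(w * abs(x0 - x) for x, w in zip(xs, ws))


def weighted_median(xs, ws):
    """Return (lo, hi); every point in [lo, hi] minimizes cost().

    Assumes all weights are positive. xs does not need to be sorted.
    """
    pairs = sorted(zip(xs, ws))
    total = sum(ws)
    running = 0
    for k, (x, w) in enumerate(pairs):
        running += w
        if 2 * running > total:
            # More than half of the weight lies at or before x.
            return x, x
        if 2 * running == total:
            # Exactly half the weight: x and its right neighbour both work.
            return x, pairs[k + 1][0]
```

For `xs = [1, 2, 3, 4]` and `ws = [1, 2, 2, 5]` this gives `(3, 4)`, and `cost` evaluates to 9 at 3, at 4, and at any point in between.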

7 answers
  • 2021-01-03 08:55

    Haskell code, ungolfed: trying for a reasonable functional solution.

    import Data.List (zip4)
    import Data.Maybe (listToMaybe)
    
    mid :: (Num a, Ord a) => [a] -> (Int, Bool)
    mid w = (i, total == part && maybe False (l ==) r) where
        (i, l, r, part):_ = dropWhile less . zip4 [0..] w v $ map (2*) sums
        _:sums = scanl (+) 0 w; total = last sums; less (_,_,_,x) = x < total
        v = map Just w ++ repeat Nothing
    
    wmedian :: (Num a, Ord a) => [a] -> [a] -> (a, Maybe a)
    wmedian w x = (left, if rem then listToMaybe rest else Nothing) where
        (i, rem) = mid w; left:rest = drop i x
    
    > wmedian [1,1,1,1] [1,2,3,4]
    (2,Just 3)
    > wmedian [1,1,2,1] [1,2,3,4]
    (3,Nothing)
    > wmedian [1,2,2,5] [1,2,3,4]
    (3,Just 4)
    > wmedian [1,2,2,6] [1,2,3,4]
    (4,Nothing)
    
    > wmedian [1..10] [0..9]
    (6,Nothing)
    > wmedian ([1..6]++[6..8]) [1..9]
    (6,Just 7)
    

    My original J solution was a straightforward translation of the above Haskell code.

    Here's a Haskell translation of the current J code:

    {-# LANGUAGE ParallelListComp #-}
    import Data.List (find); import Data.Maybe (fromJust)
    w&x=foldr((+).fst.fromJust.find((>=sum w).snd))0[f.g(+)0$map
        (2*)w|f<-[zip x.tail,reverse.zip x]|g<-[scanl,scanr]]/2
    

    Yeah… please don't write code like this.

    > [1,1,1,1]&[1,2,3,4]
    2.5
    > [1,1,2,1]&[1,2,3,4]
    3
    > [1,2,2,5]&[1,2,3,4]
    3.5
    > [1,2,2,6]&[1,2,3,4]
    4
    > [1..10]&[0..9]
    6
    > ([1..6]++[6..8])&[1..9]
    6.5
    
  • 2021-01-03 09:05

    Short, and does what you'd expect. Not particularly space-efficient.

    def f(l,i):
       x,y=[],sum(i)
       map(x.extend,([m]*n for m,n in zip(l,i)))
       return (x[y/2]+x[(y-1)/2])/2.
    

    Here's the constant-space version using itertools. It still has to iterate sum(i)/2 times, so it won't beat the index-calculating algorithms.

    from itertools import *
    def f(l,i):
       y=sum(i)-1
       return sum(islice(
           chain(*([m]*n for m,n in zip(l,i))),
           y/2,
           (y+1)/2+1
       ))/(y%2+1.)
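
Both snippets above rely on Python 2 behaviour: `map` runs eagerly and `/` truncates on integers. A possible Python 3 rendering is sketched here, assuming integer weights and values that are already sorted. The names `f_list` and `f_lazy` are illustrative.

```python
from itertools import chain, islice, repeat


def f_list(l, i):
    """Expand each value by its integer weight, then average the middle."""
    x = [m for m, n in zip(l, i) for _ in range(n)]
    y = len(x)
    return (x[y // 2] + x[(y - 1) // 2]) / 2


def f_lazy(l, i):
    """Same result without building the expanded list in memory."""
    y = sum(i) - 1
    expanded = chain.from_iterable(repeat(m, n) for m, n in zip(l, i))
    # Slice out the one or two middle elements of the expanded sequence.
    middle = islice(expanded, y // 2, (y + 1) // 2 + 1)
    return sum(middle) / (y % 2 + 1)
```

Both functions return 2.5 for values `[1, 2, 3, 4]` with weights `[1, 1, 1, 1]`, and 3.5 with weights `[1, 2, 2, 5]`.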
    
  • 2021-01-03 09:05

    Just a comment about your code: I really hope I will not have to maintain it, unless you also wrote all the unit tests that are required here :-)

    It is not related to your question of course, but usually, the "shortest way to code" is also the "hardest way to maintain". For scientific applications, it is probably not a show stopper. But for IT applications, it is.

    I think it has to be said. All the best.

  • 2021-01-03 09:06

    Python:

    a=sum([[X]*W for X,W in zip(x,w)],[]);l=len(a);a[l/2]+a[(l-1)/2]
    
  • 2021-01-03 09:08

    So, here's how far I could squeeze my own solution, still leaving some whitespace:

        int s = 0, i = 0;
        for (; i < n; s += w[i++]) ;
        while ( (s -= 2*w[--i] ) > 0) ;
        a  =  x[i]  +  x[ !s && (w[i]==w[i-1]) ? i-1 : i ]; 
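
Read as Python, the loop above might look like the sketch below. It assumes positive weights and sorted `x`, keeps the snippet's equal-weight check as it stands, and, like the C version, yields the sum of the two middle values (twice the median). `twice_median` is an illustrative name.

```python
def twice_median(x, w):
    """Literal port of the C snippet: scan from the right, subtracting 2*w."""
    s = sum(w)          # total weight, as computed by the first for-loop
    i = len(w)
    while True:
        i -= 1
        s -= 2 * w[i]
        if s <= 0:      # the suffix starting at i now holds at least half
            break
    # s == 0 means the suffix holds exactly half of the total weight.
    j = i - 1 if (s == 0 and w[i] == w[i - 1]) else i
    return x[i] + x[j]
```

With `x = [1, 2, 3, 4]`, weights `[1, 1, 1, 1]` give 5 and weights `[1, 1, 2, 1]` give 6.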
    
  • 2021-01-03 09:08

    Something like this? O(n) running time.

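    // Assumes x is sorted ascending and every weight is positive.
    static double weightedMedian(double[] x, double[] w)
    {
        // sums[i] = w[0] + ... + w[i]
        double sum = 0;
        double[] sums = new double[x.length];
        for (int i = 0; i < x.length; i++)
        {
            sum += w[i];
            sums[i] = sum;
        }

        double median = sum / 2;

        for (int i = 0; i < x.length; i++)
        {
            if (median < sums[i])
                return x[i];                  // half the weight falls inside x[i]
            if (median == sums[i])
                return (x[i] + x[i + 1]) / 2; // half the weight falls on a boundary
        }
        throw new IllegalArgumentException("need at least one positive weight");
    }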
    

    Not sure how you can get two answers for the median, do you mean if sum/2 is exactly equal to a boundary?
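
Assuming positive weights, that is the case: two candidate medians appear when some prefix of the weights adds up to exactly half the total. A small Python sketch to detect it, where `boundary_hit` is an illustrative name:

```python
from itertools import accumulate


def boundary_hit(w):
    """Return k if w[0] + ... + w[k] is exactly half of sum(w), else None."""
    total = sum(w)
    for k, c in enumerate(accumulate(w)):
        if 2 * c == total:
            return k
    return None
```

For weights `[1, 1, 1, 1]` it returns 1, so both `x[1]` and `x[2]` are medians; for `[1, 1, 2, 1]` it returns `None` and the median is unique.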

    EDIT: After looking at your formatted code, I see that my code does essentially the same thing. Did you want a MORE efficient method?

    EDIT2: The search part can be done with a modified binary search, which would make it slightly faster.

    finalIndex = binarySearch(0, sums.length - 1);
    
    // Returns the first index i in [lo, hi] with sums[i] >= median.
    int binarySearch(int lo, int hi)
    {
        if (lo >= hi)
            return lo;
        int mid = (lo + hi) / 2;
        if (sums[mid] < median)
            return binarySearch(mid + 1, hi);
        return binarySearch(lo, mid);
    }
    

    Will have to do some checking to make sure it doesn't go on infinitely on edge cases.
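
One way to sidestep the termination worry is to lean on a library binary search. The Python sketch below uses `bisect_left` from the standard library on the doubled prefix sums, which avoids dividing the total by two. It assumes non-negative weights so that the prefix sums are sorted, and `median_index` is an illustrative name.

```python
from bisect import bisect_left
from itertools import accumulate


def median_index(w):
    """Index of the first element whose cumulative weight reaches half the total."""
    sums = list(accumulate(w))
    doubled = [2 * s for s in sums]
    # First position where 2 * (prefix sum) >= total weight.
    return bisect_left(doubled, sums[-1])
```

With `x` sorted, `x[median_index(w)]` is the lower of the candidate medians. Building the prefix sums is still linear, so the binary search mainly pays off when the sums are reused.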
