Computing the mean of a list efficiently in Haskell

傲寒 2021-02-04 11:21

I've designed a function to compute the mean of a list. Although it works fine, I think it may not be the best solution because it takes two functions rather than one. Is it

6 answers
  •  闹比i (original poster)
    2021-02-04 11:37

    For those who are curious to know what glowcoder's and Assaf's approach would look like in Haskell, here's one translation:

    avg [] = 0
    avg x@(t:ts) = let xlen = toRational $ length x
                       tslen = toRational $ length ts
                       prevAvg = avg ts
                   in (toRational t) / xlen + prevAvg * tslen / xlen
    

    This version ensures that each step has the "average so far" correctly calculated, but at the cost of a lot of redundant multiplying and dividing by lengths. It also calls length on the remaining list at every step, which makes the whole computation quadratic in the length of the list. No seasoned Haskeller would write it this way.

    An only slightly better way is:

    avg2 [] = 0
    avg2 x = fst $ avg_ x
        where 
          avg_ [] = (toRational 0, toRational 0)
          avg_ (t:ts) = let
               (prevAvg, prevLen) = avg_ ts
               curLen = prevLen + 1
               curAvg = (toRational t) / curLen + prevAvg * prevLen / curLen
            in (curAvg, curLen)
    

    This avoids repeated length calculation. But it requires a helper function, which is precisely what the original poster is trying to avoid. And it still requires a whole bunch of canceling out of length terms.
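
    To see why those length terms are redundant, here is the recurrence written out. The symbols are my own notation, not from the original posts: n is the length of t:ts, S_{n-1} is the sum of ts, and a_k is the average of a list of k elements.

```latex
\begin{aligned}
a_n &= \frac{t}{n} + a_{n-1}\cdot\frac{n-1}{n}
     = \frac{t}{n} + \frac{S_{n-1}}{n-1}\cdot\frac{n-1}{n} \\
    &= \frac{t + S_{n-1}}{n}
     = \frac{S_n}{n}
\end{aligned}
```

    The factor n-1 that each step multiplies in is exactly the one the previous step divided by, so carrying the plain sum and dividing once gives the same result.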

    To avoid canceling out lengths at every step, we can just build up the sum and the length and divide once at the end:

    avg3 [] = 0
    avg3 x = (toRational total) / (toRational len)
        where 
          (total, len) = avg_ x
          avg_ [] = (0, 0)
          avg_ (t:ts) = let 
              (prevSum, prevLen) = avg_ ts
           in (prevSum + t, prevLen + 1)
    

    And this can be much more succinctly written as a foldr:

    avg4 [] = 0
    avg4 x = (toRational total) / (toRational len)
        where
          (total, len) = foldr avg_ (0,0) x
          avg_ t (prevSum, prevLen) = (prevSum + t, prevLen + 1)
    

    which can be further simplified as per the posts above.

    Fold really is the way to go here.
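
    One caveat on efficiency: with this step function, foldr cannot produce anything until it has reached the end of the list, so avg4 uses stack proportional to the list's length. A possible variant, sketched here on the assumption that a strict left fold is acceptable, uses foldl' from Data.List and forces both halves of the accumulator at each step. The names avg5 and step are mine, not from the posts above.

```haskell
import Data.List (foldl')

-- Hypothetical strict variant of avg4: one pass, constant stack.
avg5 :: Real a => [a] -> Rational
avg5 [] = 0  -- same convention as avg4: the empty list averages to 0
avg5 xs = toRational total / toRational count
  where
    (total, count) = foldl' step (0, 0 :: Int) xs
    -- foldl' only forces the pair itself, so force the running sum and
    -- the running count explicitly to keep thunks from piling up.
    step (s, n) t =
      let s' = s + t
          n' = n + 1
      in s' `seq` n' `seq` (s', n')
```

    For example, avg5 [1, 2, 3, 4 :: Int] evaluates to 5 % 2. It still needs a helper for the fold step, but it traverses the list only once.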
