groupBy-like function such that the binary predicate holds between consecutive elements of each group instead of any two

为君一笑 提交于 2021-02-08 13:57:14

问题


On Hackage I see that groupBy's implementation is this:

groupBy                 :: (a -> a -> Bool) -> [a] -> [[a]]
groupBy _  []           =  []
groupBy eq (x:xs)       =  (x:ys) : groupBy eq zs
                           where (ys,zs) = span (eq x) xs

which means that the preticate eq holds between any two elements of each group. Examples:

> difference_eq_1 = ((==1).) . flip (-)
> first_isnt_newline = ((/= '\n').) . const
>
> Data.List.groupBy difference_eq_1 ([1..10] ++ [11,13..21])
[[1,2],[3,4],[5,6],[7,8],[9,10],[11],[13],[15],[17],[19],[21]]
>
> Data.List.groupBy first_isnt_newline "uno\ndue\ntre"
["uno\ndue\ntre"]

What if instead I want to group elements such that the predicate holds between any pair of consecutive elements, so that the above results would be as follows?

[[1,2,3,4,5,6,7,8,9,10,11],[13],[15],[17],[19],[21]]
["uno\n","due\n","tre"]

I wrote it myself, and it looks a bit ugly

groupBy' :: (a -> a -> Bool) -> [a] -> [[a]]
groupBy' p = foldr step []
  where step elem [] = [[elem]]
        step elem gs'@((g'@(prev:g)):gs)
          | elem `p` prev = (elem:g'):gs
          | otherwise = [elem]:gs'

So I was wandering if such a function exists already and I just don't find it.

As regards the second usage, Data.List.groupBy first_isnt_newline, where the binary predicate basically ignores the second argument and applies a unary predicate to the first, I've just found that Data.List.HT.segmentAfter unary_predicate does the job, where unary_predicate is the negation of the unary predicate in which const's output is forwarded. In other words Data.List.groupBy ((/= '\n').) . const === Data.List.HT.segmentAfter (=='\n').


回答1:


There is a groupBy package that does exactly that.

But here’s another way of implementing it:

  • Zip the list with its tail to test the predicate on adjacent elements

  • Generate a “group index” by scanning the result and incrementing the group whenever the predicate is false

  • Group by the index

  • Remove the indices

groupByAdjacent :: (a -> a -> Bool) -> [a] -> [[a]]
groupByAdjacent p xs
  = fmap (fmap fst)
  $ groupBy ((==) `on` snd)
  $ zip xs
  $ scanl' (\ g (a, b) -> if p a b then g else succ g) 0
  $ zip xs
  $ drop 1 xs

For an input like [1, 2, 3, 10, 11, 20, 30], the predicate will return [True, True, False, True, False, False] and the resulting group indices will be [0, 0, 0, 1, 1, 2, 3].

The scan can also be written pointfree as scanr (bool succ id . uncurry p) 0, since the scan direction doesn’t matter (although the group indices will be reversed). The group index might be handy or just more readable to keep as an integer, but it could be a Bool instead, because the minimum size of a group is 1: the functional argument of the scan would be bool not id . uncurry p, which can be simplified to (==) . uncurry p. And several of these parts could be factored into reusable functions, like zipNext = zip <*> drop 1, but I’ve inlined them for simplicity’s sake.



来源:https://stackoverflow.com/questions/64248500/groupby-like-function-such-that-the-binary-predicate-holds-between-consecutive-e

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!