F#: removing duplicates from a seq is slow


半阙折子戏 2021-01-12 03:11

I am attempting to write a function that weeds out consecutive duplicates, as determined by a given equality function, from a seq<'a>, but with a twist:

9 answers

终归单人心 (OP)

2021-01-12 04:10
To make efficient use of the input type Seq, one should iterate through each element only once and avoid creating additional sequences.

On the other hand, to make efficient use of the output type List, one should make liberal use of cons and of head/tail pattern matching, both of which are constant-time operations.

Combining the two requirements leads me to this solution:
```
// dedupeTakingLast2 : ('a -> 'a -> bool) -> seq<'a> -> 'a list
let dedupeTakingLast2 equalityFn = 
  Seq.fold 
  <| fun deduped elem ->     
       match deduped with
       | [] -> [ elem ]
       | x :: xs -> if equalityFn x elem 
                      then elem :: xs
                      else elem :: deduped
  <| []
```
Note, however, that the resulting list will be in reverse order, because each element is prepended to the accumulator. I hope this isn't a dealbreaker, since restoring the original order with List.rev costs an additional O(n) pass over the result.
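If the original order does matter, one option is to accept the extra pass and reverse at the end. The following is only a minimal sketch of that idea, not part of the solution above: the names dedupeTakingLastInOrder, step, and sameDecade are hypothetical and introduced purely for illustration.

```fsharp
// Hypothetical variant: same single-pass fold as above, followed by List.rev.
let dedupeTakingLastInOrder (equalityFn: 'a -> 'a -> bool) (source: seq<'a>) : 'a list =
    // Fold step: the accumulator holds the deduplicated elements in reverse order.
    let step deduped elem =
        match deduped with
        | x :: xs when equalityFn x elem -> elem :: xs // same run: replace the head with the newer element
        | _ -> elem :: deduped                         // new run, or empty accumulator: prepend
    source
    |> Seq.fold step []
    |> List.rev // one extra O(n) pass to restore input order

// Hypothetical equality function: two ints are "equal" if they share the same tens digit group.
let sameDecade x y = x / 10 = y / 10

let inOrder = dedupeTakingLastInOrder sameDecade (seq { 0 .. 29 })
// inOrder = [9; 19; 29]
```

The input is still traversed only once; the reversal runs over the (usually much shorter) deduplicated list, so whether its cost matters depends on how many runs survive.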

Test:
```
List.init 1000 (id) 
|> dedupeTakingLast2 (fun x y -> x - (x % 10) = y - (y % 10))
|> List.iter (printfn "%i ")

// 999 989 979 969 etc...
```