F#: removing duplicates from a seq is slow

半阙折子戏 2021-01-12 03:11

I am attempting to write a function that weeds out consecutive duplicates, as determined by a given equality function, from a seq<'a>, but with a twist:

9 answers
  •  北荒 (original poster)
    2021-01-12 04:09

    Here is a pretty fast approach which uses library functions rather than Seq expressions.

    Your test runs in 0.007 seconds on my PC.

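    It relies on a pretty nasty hack for the first element: you have to pass a dummy prev value, and that value can show up as the first item of the output (see the timing run below). That part doesn't work brilliantly and could be improved.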

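    // Keeps the last element of each run of consecutive items equal to prev.
    // prev is a dummy seed on the first call; it is emitted as the first output
    // element whenever a non-empty input does not start with an item equal to it.
    let rec dedupe equalityfn prev (s:'a seq) : 'a seq =
        if Seq.isEmpty s then
            Seq.empty
        else
            // Everything after the leading run of items equal to prev
            let rest = Seq.skipWhile (equalityfn prev) s
            // The leading run of items equal to prev
            let valid = Seq.takeWhile (equalityfn prev) s
            let valid2 = if Seq.isEmpty valid then Seq.singleton prev else Seq.last valid |> Seq.singleton
            let filtered = if Seq.isEmpty rest then Seq.empty else dedupe equalityfn (Seq.head rest) rest
            Seq.append valid2 filtered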
    
    let t = [("a", 1); ("b", 2); ("b", 3); ("b", 4); ("c", 5)]
            |> dedupe (fun (x1, y1) (x2, y2) -> x1=x2) ("asdfasdf",1)
            |> List.ofSeq;;
    
    #time
    List.init 1000 (fun _ -> 1)
    |> dedupe (fun x y -> x = y) (189234784)
    |> List.ofSeq
    #time;;
    --> Timing now on
    
    Real: 00:00:00.007, CPU: 00:00:00.006, GC gen0: 0, gen1: 0
    val it : int list = [189234784; 1]
    
    --> Timing now off
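
One possible way to get rid of the dummy first element is sketched below. It assumes, as the dedupe function above does, that the last element of each run of consecutive equal items is the one to keep. The name dedupeLast is hypothetical, and this version walks the enumerator directly in a seq expression instead of using the library functions, so treat it as an alternative sketch rather than a fix of the code above. It has not been timed.

```fsharp
// Hypothetical helper (not from the original answer): keeps the last element
// of each run of consecutive items that equalityfn considers equal.
// No seed value is needed, so nothing extra appears in the output.
let dedupeLast (equalityfn: 'a -> 'a -> bool) (s: seq<'a>) : seq<'a> =
    seq {
        use e = s.GetEnumerator()
        if e.MoveNext() then
            // Last element seen so far in the current run
            let prev = ref e.Current
            while e.MoveNext() do
                let cur = e.Current
                // A new run starts here, so the previous run's last element is final
                if not (equalityfn prev.Value cur) then
                    yield prev.Value
                prev.Value <- cur
            // Emit the last element of the final run
            yield prev.Value
    }

let t2 =
    [("a", 1); ("b", 2); ("b", 3); ("b", 4); ("c", 5)]
    |> dedupeLast (fun (x1, _) (x2, _) -> x1 = x2)
    |> List.ofSeq
// t2 = [("a", 1); ("b", 4); ("c", 5)]
```

The input is enumerated exactly once and only one previous element is held at a time, so an empty input simply yields an empty sequence.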
    
