How can I remove duplicates in an F# sequence without using references

南方客 2021-01-20 09:07

I have a sorted sequence and want to go through it and return the unique entries in the sequence. I can do it using the following function, but it uses reference variables a

6 answers
  • 2021-01-20 09:41

    Seq.distinct (1::[1..5]) returns seq [1; 2; 3; 4; 5]. Is that what you meant?

  • 2021-01-20 09:42

    Seq.distinct and Seq.distinctBy both use a Dictionary internally, so they require hashing and memory proportional to the number of unique items. If your sequence is already sorted, you can use the following approach (similar to yours). In the timings below it is nearly twice as fast, and it uses constant memory, which makes it usable for sequences of any size.

    let distinctWithoutHash (items:seq<_>) =
      seq {
        use e = items.GetEnumerator()
        if e.MoveNext() then
          let prev = ref e.Current
          yield !prev
          while e.MoveNext() do
            if e.Current <> !prev then 
              yield e.Current
              prev := e.Current
      }
    
    let items = Seq.init 1000000 (fun i -> i / 2)
    let test f = items |> f |> (Seq.length >> printfn "%d")
    
    test Seq.distinct        //Real: 00:00:01.038, CPU: 00:00:01.435, GC gen0: 47, gen1: 1, gen2: 1
    test distinctWithoutHash //Real: 00:00:00.622, CPU: 00:00:00.624, GC gen0: 44, gen1: 0, gen2: 0
    

    I couldn't figure out a way to use mutables instead of refs, short of hand-coding an enumerator. I expected that to speed it up considerably, but I tried it and it makes no difference.
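
    For readers on a newer compiler: the sketch below assumes F# 4.0 or later, where a `let mutable` local can be used inside a sequence expression, so the ref cell is not needed. The name `distinctSorted` is made up for this illustration. Like the version above, it only removes adjacent duplicates, so it relies on the input being sorted.

    ```fsharp
    // Sketch, assuming F# 4.0+: same algorithm as distinctWithoutHash above,
    // but with a mutable local instead of a ref cell.
    // `distinctSorted` is a hypothetical name, not a library function.
    let distinctSorted (items: seq<_>) =
        seq {
            use e = items.GetEnumerator()
            if e.MoveNext() then
                // Remember the last value yielded and emit the first element.
                let mutable prev = e.Current
                yield prev
                while e.MoveNext() do
                    let current = e.Current
                    // Yield only when the value differs from the previous one.
                    if current <> prev then
                        yield current
                        prev <- current
        }

    // 0,0,1,1,2,2,3,3,4,4 collapses to one entry per value.
    let result = Seq.init 10 (fun i -> i / 2) |> distinctSorted |> Seq.toList
    // result = [0; 1; 2; 3; 4]
    ```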

  • 2021-01-20 09:43
    [1;1;1;2;2;2;3;3;3]
    |> Seq.distinctBy id
    |> printfn "%A"
    
  • 2021-01-20 09:52

    In my case I could not use Seq.distinct because I needed to preserve the order of the list elements. I used the solution from http://ocaml.org/learn/tutorials/99problems.html, which I think is quite short:

    let rec compress = function
        | a :: (b :: _ as t) -> if a = b then compress t else a :: compress t
        | smaller -> smaller
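
    A quick check of what `compress` does, with the function repeated so the snippet stands alone. It only collapses runs of adjacent equal elements, so it removes every duplicate only when the input is sorted, as in the question. The names `sorted` and `unsorted` are just for illustration. Also note that `a :: compress t` is not a tail call, so very long lists may be a concern.

    ```fsharp
    let rec compress = function
        | a :: (b :: _ as t) -> if a = b then compress t else a :: compress t
        | smaller -> smaller

    // Sorted input: every duplicate is adjacent, so all of them are dropped.
    let sorted = compress [1; 1; 1; 2; 2; 3]   // [1; 2; 3]

    // Unsorted input: only adjacent repeats are dropped.
    let unsorted = compress [1; 2; 1; 1; 2]    // [1; 2; 1; 2]
    ```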
    
  • 2021-01-20 09:54

    Just initialize a unique collection (like a set) with the sequence like this:

    set [1; 2; 3; 3; 4; 5; 5];;
    => val it : Set<int> = set [1; 2; 3; 4; 5]
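
    One thing to keep in mind with this approach: an F# `Set` is ordered by comparison, so the result comes back sorted rather than in input order, and the whole input is consumed eagerly. That is harmless for an already sorted sequence. A small illustration (the binding names are made up for the example):

    ```fsharp
    let input = [3; 1; 3; 2; 1]

    // Set.ofList removes duplicates but returns elements in sorted order.
    let viaSet = input |> Set.ofList |> Set.toList            // [1; 2; 3]

    // Seq.distinct keeps the first occurrence of each element, in input order.
    let viaDistinct = input |> Seq.distinct |> Seq.toList     // [3; 1; 2]
    ```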
    
  • 2021-01-20 09:54

    The solution below preserves the order of elements and keeps only the first occurrence of each element in a generic list. It builds a new list with the redundant items removed.

    //  ****  Returns a list with repeated elements removed, keeping the first occurrence of each
    let removeDuplicates (lst : 'a list) =
        lst
        // Prepend each item not seen yet; acc holds the unique items in reverse order
        |> List.fold (fun acc item -> if List.contains item acc then acc else item :: acc) []
        |> List.rev
    //  **** END OF FUNCTION removeDuplicates
    
    val removeDuplicates : 'a list -> 'a list when 'a : equality
    val testList : int list = [1; 4; 3; 1; 2; 2; 1; 1; 3; 4; 3]
    val tryAbove : int list = [1; 4; 3; 2]
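
    For comparison, assuming F# 4.0 or later is available, the built-in `List.distinct` also keeps the first occurrence of each element in input order, and it is hash-based rather than scanning the accumulated list for every item. A sketch using the same test data as above (the name `viaBuiltin` is made up for the example):

    ```fsharp
    let testList = [1; 4; 3; 1; 2; 2; 1; 1; 3; 4; 3]

    // List.distinct keeps first occurrences in their original order.
    let viaBuiltin = List.distinct testList   // [1; 4; 3; 2]
    ```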
    