I have a sorted sequence and want to go through it and return the unique entries in the sequence. I can do it using the following function, but it uses reference variables a
Seq.distinct (1::[1..5]) returns seq [1; 2; 3; 4; 5]. Is that what you meant?
distinct and distinctBy both use Dictionary and therefore require hashing and a bit of memory for storing unique items. If your sequence is already sorted, you can use the following approach (similar to yours). In the timings below it ran in roughly 60% of the time Seq.distinct needed, and it only has to remember the previous item, so its memory use is constant and it works for sequences of any size.
let distinctWithoutHash (items:seq<_>) =
    seq {
        use e = items.GetEnumerator()
        if e.MoveNext() then
            let prev = ref e.Current
            yield !prev
            while e.MoveNext() do
                if e.Current <> !prev then
                    yield e.Current
                    prev := e.Current
    }
let items = Seq.init 1000000 (fun i -> i / 2)
let test f = items |> f |> (Seq.length >> printfn "%d")
test Seq.distinct //Real: 00:00:01.038, CPU: 00:00:01.435, GC gen0: 47, gen1: 1, gen2: 1
test distinctWithoutHash //Real: 00:00:00.622, CPU: 00:00:00.624, GC gen0: 44, gen1: 0, gen2: 0
I couldn't figure out a way to use mutables instead of refs (short of hand-coding an enumerator). I was sure that would speed it up considerably, but I tried it and it makes no difference.
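For what it's worth, on F# 4.0 or later a let mutable local is accepted inside a sequence expression, so the same loop can be written without ref cells. This is only a sketch under that assumption: distinctSorted is a made-up name, and I have not timed it against the version above.

```fsharp
// Sketch: the same adjacent-duplicate filter with a mutable local instead of a ref cell.
// Assumes F# 4.0 or later; distinctSorted is a hypothetical name, not a library function.
let distinctSorted (items: seq<'a>) : seq<'a> =
    seq {
        use e = items.GetEnumerator()
        if e.MoveNext() then
            // Remember the last item that was yielded.
            let mutable prev = e.Current
            yield prev
            while e.MoveNext() do
                let current = e.Current
                if current <> prev then
                    yield current
                    prev <- current
    }

// Only adjacent duplicates are dropped, so the input has to be sorted already.
let result = distinctSorted [1; 1; 2; 2; 3; 5; 5] |> Seq.toList
printfn "%A" result
```

Like the ref version, it only ever compares an item with the one just before it.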
[1;1;1;2;2;2;3;3;3]
|> Seq.distinctBy id
|> printfn "%A"
In my case I could not use Seq.distinct because I needed to preserve the order of the list elements. I used the solution from http://ocaml.org/learn/tutorials/99problems.html. I think it is quite short:
let rec compress = function
| a :: (b :: _ as t) -> if a = b then compress t else a :: compress t
| smaller -> smaller
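A quick usage sketch, in case it helps: compress only compares neighbouring elements, so it removes every duplicate from a sorted list but leaves non-adjacent repeats alone. The definition is repeated so the snippet runs on its own; the names sortedInput and unsortedInput are just for illustration.

```fsharp
// compress as defined above, repeated so this snippet is self-contained.
let rec compress = function
    | a :: (b :: _ as t) -> if a = b then compress t else a :: compress t
    | smaller -> smaller

// Sorted input: every duplicate is adjacent, so all of them go away.
let sortedInput = compress [1; 1; 1; 2; 2; 2; 3; 3; 3]
// Unsorted input: the trailing 1 is not next to the other 1s, so it stays.
let unsortedInput = compress [1; 1; 2; 1]

printfn "%A" sortedInput     // [1; 2; 3]
printfn "%A" unsortedInput   // [1; 2; 1]
```

Note that the else branch is not a tail call, so very long lists may be a problem for the stack.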
Just initialize a unique collection (like a set) with the sequence like this:
set [1; 2; 3; 3; 4; 5; 5];;
=> val it : Set<int> = set [1; 2; 3; 4; 5]
The solution below preserves the order of elements and returns only the first occurrence of each element in a generic list. Of course, this generates a new list with the redundant items removed.
// **** Returns a list having subsequent redundant elements removed
let removeDuplicates(lst : 'a list) =
    let f item acc =
        match acc with
        | [] -> [item]
        | _ ->
            match List.exists(fun x -> x = item) acc with
            | false -> item :: acc
            | true -> acc
    lst
    |> List.rev
    |> fun x -> List.foldBack f x []
    |> List.rev
// **** END OF FUNCTION removeDuplicates
val removeDuplicates : 'a list -> 'a list when 'a : equality
val testList : int list = [1; 4; 3; 1; 2; 2; 1; 1; 3; 4; 3]
val tryAbove : int list = [1; 4; 3; 2]
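As a cross-check, and assuming F# 4.0 or later where List.distinct is available, the built-in function also keeps the first occurrence of each element and discards the later ones, so on this test list it should give the same result. The name viaBuiltIn is only for illustration.

```fsharp
// Same input as the FSI session above.
let testList = [1; 4; 3; 1; 2; 2; 1; 1; 3; 4; 3]

// List.distinct keeps the first occurrence of each element, in original order.
let viaBuiltIn = List.distinct testList

printfn "%A" viaBuiltIn   // [1; 4; 3; 2]
```

The hand-written version scans the accumulator with List.exists for every item, so its cost grows quadratically with the number of distinct elements, whereas List.distinct relies on hashing.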