Here’s a problem I’ve really been struggling with. I need to merge two sorted sequences into a single sorted sequence. Ideally, the algorithm should be lazy-evaluated, and
Use the LazyList type in the PowerPack. I think I maybe even have this exact code lying around, let me look...
EDIT: Not exactly it, but close: http://cs.hubfs.net/forums/thread/8136.aspx
Ideally, the algorithm should be lazy-evaluate... the creation of a subsequence for every item is a performance killer
Lazy means slow but here is a solution using lazy lists:
    let (++) = LazyList.consDelayed

    let rec merge xs ys () =
        match xs, ys with
        | Cons(x, xs'), Cons(y, _) when x < y -> x ++ merge xs' ys
        | Cons(x, _), Cons(y, ys') -> y ++ merge xs ys'
        | Nil, xs | xs, Nil -> xs
I think by "lazy evaluated" you mean that you want the merged result to be generated on demand, so you can also use:
    let rec merge xs ys = seq {
        match xs, ys with
        | x::xs, y::_ when x < y ->
            yield x
            yield! merge xs ys
        | x::_, y::ys ->
            yield y
            yield! merge xs ys
        | [], xs | xs, [] -> yield! xs
    }
As you say, this is very inefficient. However, a seq-based solution doesn't have to be slow. Here, Seq.unfold is your friend and can make this over 4× faster by my measurements:
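    let merge xs ys =
        // The state is the pair of remaining lists; gen is not recursive, so no "rec" is needed
        let gen = function
            | x::xs, (y::_ as ys) when x < y -> Some(x, (xs, ys))
            | xs, y::ys -> Some(y, (xs, ys))
            // Only the right-hand list is exhausted (an empty left list is handled above)
            | x::xs, [] -> Some(x, (xs, []))
            | [], [] -> None
        Seq.unfold gen (xs, ys)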
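For anyone who hasn't used Seq.unfold before, here is a minimal sketch of how it drives a generator: the function receives the current state and returns Some(element, nextState) to continue or None to stop. The names countdown and mergeLists are made up for this illustration (they are not from the answer above), and mergeLists is just the same list-pair idea restated so the snippet runs on its own.

```fsharp
// countdown is a hypothetical helper: the state is a single int.
let countdown n =
    Seq.unfold (fun i -> if i > 0 then Some(i, i - 1) else None) n

// mergeLists is a hypothetical name: the state is the pair of remaining lists.
let mergeLists xs ys =
    let gen = function
        | x :: xs', (y :: _ as ys') when x < y -> Some(x, (xs', ys'))
        | xs', y :: ys' -> Some(y, (xs', ys'))
        | x :: xs', [] -> Some(x, (xs', []))
        | [], [] -> None
    Seq.unfold gen (xs, ys)

printfn "%A" (countdown 3 |> Seq.toList)
// prints [3; 2; 1]

// Only the first four elements are ever generated here.
printfn "%A" (mergeLists [1; 3; 5] [2; 4; 6] |> Seq.truncate 4 |> Seq.toList)
// prints [1; 2; 3; 4]
```

Because the generator is only called when the consumer asks for the next element, truncating the result stops the merge early without building any intermediate sequences.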
Sequences don't really pattern match well.
Fortunately, one of the advantages of F# is being able to drop down to imperative code when you need to, and I think it is still idiomatic to use mutable state internally so long as the function is still pure to the clients consuming it. I think this style is really common in the F# source code wherever sequences are involved.
It's not pretty, but this works:
    open System.Collections.Generic

    let merge (a : #seq<'a>) (b : #seq<'a>) =
        seq {
            use a = a.GetEnumerator()
            use b = b.GetEnumerator()

            // Flags recording whether each enumerator currently has an element
            let aNext = ref <| a.MoveNext()
            let bNext = ref <| b.MoveNext()

            // Return the current element, then advance and update the flag
            let inc (enumerator : IEnumerator<'a>) flag =
                let temp = enumerator.Current
                flag := enumerator.MoveNext()
                temp
            let incA() = inc a aNext
            let incB() = inc b bNext

            while !aNext || !bNext do
                match !aNext, !bNext with
                | true, true ->
                    if a.Current > b.Current then yield incB()
                    elif a.Current < b.Current then yield incA()
                    else
                        // Equal elements: emit both, each on its own line so that
                        // the second yield belongs to this branch only
                        yield incA()
                        yield incB()
                | true, false -> yield incA()
                | false, true -> yield incB()
                | false, false -> ()
        }
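To check that an enumerator-based merge really is lazy, you can feed it infinite sequences and take only a prefix. The sketch below is a compact variant written for this illustration, not the code above: the name mergeSeq is hypothetical, it uses a single <= comparison (so ties are taken from the first sequence), and it updates the ref cells through .Value.

```fsharp
// mergeSeq is a hypothetical, condensed variant of the enumerator-based merge.
let mergeSeq (xs: seq<'a>) (ys: seq<'a>) : seq<'a> =
    seq {
        use a = xs.GetEnumerator()
        use b = ys.GetEnumerator()
        // Whether each enumerator currently points at a valid element
        let aHas = ref (a.MoveNext())
        let bHas = ref (b.MoveNext())
        while aHas.Value || bHas.Value do
            // Take from a when b is exhausted or a's element is not larger
            if aHas.Value && (not bHas.Value || a.Current <= b.Current) then
                yield a.Current
                aHas.Value <- a.MoveNext()
            else
                yield b.Current
                bHas.Value <- b.MoveNext()
    }

// Two infinite sorted inputs; only the first six merged elements are computed.
let evens = Seq.initInfinite (fun i -> 2 * i)
let odds = Seq.initInfinite (fun i -> 2 * i + 1)

printfn "%A" (mergeSeq evens odds |> Seq.take 6 |> Seq.toList)
// prints [0; 1; 2; 3; 4; 5]
```

The mutable state lives inside the seq block, so every enumeration of the result gets fresh enumerators and flags, which is what keeps the function pure from the caller's point of view.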