As part of a bigger problem of enumerating a set, I need to write an OCaml function 'choose' which takes a list and outputs as the list of all possible sequences of size k made up of elements of that list (without repeating sequences which can be obtained from each other by permutation). The order they are put in the end list is not relevant.
For example,
choose 2 [1;2;3;4] = [[1;2];[1;3];[1;4];[2;3];[2;4];[3;4]]
Any ideas?
I would like to have the whole thing to be lazy, outputting a lazy list, but if you have a strict solution, that'll be very useful too.
Here is a strict and suboptimal version. I hope it is clear. It avoids duplicates by assuming there are no duplicates in the input list, and by generating only sublists that are in the same order as in the original list.
The length computation could be factored by passing l
's length as an argument of choose
. That would make the code less readable but more efficient.
For the lazy version, sprinkle "lazy" and "Lazy.force" on the code...
let rec choose k l =
if k = 0
then [ [] ]
else
let len = List.length l in
if len < k
then []
else if k = len
then [ l ]
else
match l with
h :: t ->
let starting_with_h =
(List.map (fun sublist -> h :: sublist) (choose (pred k) t))
in
let not_starting_with_h = choose k t in
starting_with_h @ not_starting_with_h
| [] -> assert false
;;
val choose : int -> 'a list -> 'a list list = <fun>
# choose 3 [1; 2; 3; 4; 5; 6; 7] ;;
- : int list list =
[[1; 2; 3]; [1; 2; 4]; [1; 2; 5]; [1; 2; 6]; [1; 2; 7]; [1; 3; 4]; [1; 3; 5];
[1; 3; 6]; [1; 3; 7]; [1; 4; 5]; [1; 4; 6]; [1; 4; 7]; [1; 5; 6]; [1; 5; 7];
[1; 6; 7]; [2; 3; 4]; [2; 3; 5]; [2; 3; 6]; [2; 3; 7]; [2; 4; 5]; [2; 4; 6];
[2; 4; 7]; [2; 5; 6]; [2; 5; 7]; [2; 6; 7]; [3; 4; 5]; [3; 4; 6]; [3; 4; 7];
[3; 5; 6]; [3; 5; 7]; [3; 6; 7]; [4; 5; 6]; [4; 5; 7]; [4; 6; 7]; [5; 6; 7]]
EDIT:
A lazy_list_append
as appears necessary from the comments below:
type 'a node_t =
| Empty
| Node of 'a * 'a zlist_t
and 'a zlist_t = 'a node_t lazy_t
let rec lazy_list_append l1 l2 =
lazy
(match Lazy.force l1 with
Empty -> Lazy.force l2
| Node (h, lt) ->
Node (h, lazy_list_append lt l2))
;;
Plugging in again with a Haskell solution (it's just easier to work with lazy lists since they are built-in):
combinations 0 _ = [[]]
combinations k [] = []
combinations k (x:xs) = map (x:) (combinations (k-1) xs) ++ combinations k xs
The first two cases follow from the properties of binomial coefficients and more specifically: n choose 0 = 1
for all n
including n=0
(that's why it is first to handle the case 0 choose 0
). The other one is 0 choose k = 0
. The third equation is exact translation of the recursive definition of combinations.
Unfortunately when you apply it to an infinite list it returns a trivial solution:
> take 10 $ combinations 3 [1..]
[[1,2,3],[1,2,4],[1,2,5],[1,2,6],[1,2,7],[1,2,8],[1,2,9],[1,2,10],[1,2,11],[1,2,12]]
EDIT:
OK, so we really want to go trough each combination in a finite number of steps. With the above version we are obviously using only the expression to the left of ++
which generates only combinations starting with 1. We can work around this problem by defining an interesting list zipping function which builds a list by alternately picking the head of each of its argument lists (it's important to be non-strict in the second argument):
merge [] ys = ys
merge (x:xs) ys = x:merge ys xs
and use it instead of ++
:
combinations k (x:xs) = map (x:) (combinations (k-1) xs) `merge` combinations k xs
lets see:
> let comb_10_3 = combinations 3 [1..10]
> let comb_inf_3 = combinations 3 [1..]
> take 10 comb_inf_3
[[1,2,3],[2,3,4],[1,3,4],[3,4,5],[1,2,4],[2,4,5],[1,4,5],[4,5,6],[1,2,5],[2,3,5]]
> comb_10_3 `intersect` comb_inf_3 == comb_10_3
True
> last $ combinations 3 [1..10]
[6,8,10]
> elemIndex [6,8,10] $ combinations 3 [1..]
Just 351
All 10 choose 3
combinations are there!
Just for the sake of completeness, I am putting here the final code which brings together the strict code from Pascal with my lazy stuff and all other Pascal's useful comments.
The lazy list type is defined, then two auxiliary lazy functions (append and map), and finally the function "choose" that we aim to define.
type 'a node_t =
| Nil
| Cons of 'a * 'a t
and 'a t = ('a node_t) Lazy.t
let rec append l1 l2 =
match Lazy.force l1 with
| Nil -> l2
| Cons (a, l) -> lazy (Cons (a, append l l2))
let rec map f ll = lazy (
match Lazy.force ll with
| Nil -> Nil
| Cons(h,t) -> Cons(f h, map f t) )
let rec choose k l len =
if k = 0
then lazy (Cons(lazy Nil,lazy Nil))
else
if len < k
then lazy Nil
else if k = len
then lazy (Cons (l,lazy Nil))
else
match Lazy.force l with
| Cons(h,t) -> let g h sublist = lazy (Cons (h,sublist))
in let starting_with_h = (map (g h) (choose (k-1) t (len-1)))
in let not_starting_with_h = choose k t (len-1)
in append starting_with_h not_starting_with_h
| Nil -> assert false
The result of evaluating "choose k ls n" is a lazy list of all choices of k elements of list ls, with ls considered up to size n. Note that, as pointed out by Pascal, because of the way the enumeration takes place, the function choose will not cover all choices of an infinite list.
Thanks, this was really useful!
Best, Surikator.
来源:https://stackoverflow.com/questions/3969321/lazy-n-choose-k-in-ocaml