Splitting a list into list of lists based on predicate

前端 未结 5 2006
感动是毒
感动是毒 2021-01-21 03:28

(I am aware of this question, but it relates to sequences, which is not my problem here)

Given this input (for example):

let testlist = 
    [  
       \         


        
相关标签:
5条回答
  • 2021-01-21 03:53

    Edit: rev-less version using foldBack added below.

    Here's some code that uses lists and tail-recursion:

    //divides a list L into chunks for which all elements match pred
    let divide pred L =
        let rec aux buf acc L =
            match L,buf with
            //no more input and an empty buffer -> return acc
            | [],[] -> List.rev acc 
            //no more input and a non-empty buffer -> return acc + rest of buffer
            | [],buf -> List.rev (List.rev buf :: acc) 
            //found something that matches pred: put it in the buffer and go to next in list
            | h::t,buf when pred h -> aux (h::buf) acc t
            //found something that doesn't match pred. Continue but don't add an empty buffer to acc
            | h::t,[] -> aux [] acc t
            //found input that doesn't match pred. Add buffer to acc and continue with an empty buffer
            | h::t,buf -> aux [] (List.rev buf :: acc) t
        aux [] [] L
    

    usage:

    > divide pred testlist;;
    val it : string list list =
      [["*text1"; "*text2"]; ["*text5"; "*text6"; "*text7"]]
    

    Using a list as data structure for a buffer means that it always needs to be reversed when outputting the contents. This may not be a problem if individual chunks are modestly sized. If speed/efficiency becomes an issue, you could use a Queue<'a> or a `List<'a>' for the buffers, for which appending is fast. But using these data structures instead of lists also means that you lose the powerful list pattern matching. In my opinion, being able to pattern match lists outweighs the presence of a few List.rev calls.

    Here's a streaming version that outputs the result one block at a time. This avoids the List.rev on the accumulator in the previous example:

    let dividestream pred L =
        let rec aux buf L =
            seq { match L, buf with
                  | [],[] -> ()
                  | [],buf -> yield List.rev buf
                  | h::t,buf when pred h -> yield! aux (h::buf) t
                  | h::t,[] -> yield! aux [] t
                  | h::t,buf -> yield List.rev buf
                                yield! aux [] t }
        aux [] L
    

    This streaming version avoids the List.rev on the accumulator. Using List.foldBack can be used to avoid reversing the accumulated chunks as well.

    update: here's a version using foldBack

    //divides a list L into chunks for which all elements match pred
    let divide2 pred L =
        let f x (acc,buf) =
            match pred x,buf with
            | true,buf -> (acc,x::buf)
            | false,[] -> (acc,[])
            | false,buf -> (buf::acc,[])
    
        let rest,remainingBuffer = List.foldBack f L ([],[])
        match remainingBuffer with
        | [] -> rest
        | buf -> buf :: rest
    
    0 讨论(0)
  • 2021-01-21 03:58

    Another version of shunt:

    let shunt pred lst =
        let rec tWhile pred lst = 
            match lst with
            | []                    -> [], []
            | hd :: tl when pred hd -> let taken, rest = tWhile pred tl
                                       (hd :: taken), rest
            | lst                   -> [], lst
        let rec collect = function
            | []  -> []
            | lst -> let taken, rest = tWhile pred lst
                     taken :: (collect (snd (tWhile (fun x -> not (pred x)) rest)))
        collect lst
    

    This one avoids List.rev but it's not tail recursive - so only suitable for small lists.

    0 讨论(0)
  • 2021-01-21 04:09

    Just reverse the list once up front, and then build the structure in order easily:

    let Shunt p l =
        let mutable r = List.rev l
        let mutable result = []
        while not r.IsEmpty do
            let mutable thisBatch = []
            while not r.IsEmpty && not(p r.Head) do
                r <- r.Tail 
            while not r.IsEmpty && p r.Head do
                thisBatch <- r.Head :: thisBatch
                r <- r.Tail
            if not thisBatch.IsEmpty then
                result <- thisBatch :: result
        result        
    

    The outer while deals with each 'batch', and the first inner while skips over any that don't match the predicate, followed by another while that grabs all those that do and stores them in the current batch. If there was anything in this batch (the final one may be empty), prepend it to the final result.

    This is an example where I think locally imperative code is simply superior to a purely functional counterpart. The code above is so easy to write and to reason about.

    0 讨论(0)
  • 2021-01-21 04:09

    yet another one...

    let partition pred lst = 
        let rec trec xs cont =
            match xs with
            | []               -> ([],[]) |> cont
            | h::t when pred h -> (fun (y,n) -> h::y,n) >> cont |> trec t
            | h::t             -> (fun (y,n) -> y,h::n) >> cont |> trec t
        trec lst id
    

    then we can define shunt:

    let shunt pred lst = lst |> partition pred |> (fun (x,y) -> [x;y])
    
    0 讨论(0)
  • 2021-01-21 04:12

    there is a List.partition function in the F# core library (in case you wanted to implement this just to have it working and not to learn how to write recursive functions yourself). Using this function, you can write this:

    > testlist |> List.partition (fun s -> s.StartsWith("*"))
    val it : string list * string list =
      (["*text1"; "*text2"; "*text5"; "*text6"; "*text7"], ["text3"; "text4"])
    

    Note that this function returns a tuple instead of returning a list of lists. This is a bit different to what you wanted, but if the predicate returns just true or false, then this makes more sense.

    The implementation of partition function that returns tuples is also a bit simpler, so it may be useful for learning purposes:

    let partition pred list = 
      // Helper function, which keeps results collected so
      // far in 'accumulator' arguments outTrue and outFalse
      let rec partitionAux list outTrue outFalse =
        match list with 
        | [] -> 
            // We need to reverse the results (as we collected
            // them in the opposite order!)
            List.rev outTrue, List.rev outFalse
        // Append element to one of the lists, depending on 'pred'
        | x::xs when pred x -> partitionAux xs (x::outTrue) outFalse
        | x::xs -> partitionAux xs outTrue (x::outFalse)
    
      // Run the helper function
      partitionAux list [] []   
    
    0 讨论(0)
提交回复
热议问题