mapcat breaking the laziness

孤独总比滥情好 2021-01-04 13:53

I have a function called a-function that produces lazy sequences.

If I run the code:

(map a-function a-sequence-of-values) 

it retu

2 Answers
  • 2021-01-04 14:21

    Your premise is wrong. Concat is lazy, apply is lazy if its first argument is, and mapcat is lazy.

    user> (class (mapcat (fn [x y] (println x y) (list x y)) (range) (range)))
    0 0
    1 1
    2 2
    3 3
    clojure.lang.LazySeq
    

    Note that some of the initial values are evaluated (more on this below), but clearly the whole thing is still lazy. Otherwise the call would never have returned, because (range) returns an endless sequence and never finishes when consumed eagerly.
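
    As a quick check of the same point, here is a minimal sketch that takes a few elements from mapcat over an infinite sequence. The anonymous function is only an illustration; the call can return at all only because mapcat does not consume (range) eagerly.

    ```clojure
    ;; (range) is infinite, so this terminates only because mapcat is lazy.
    (def first-five (take 5 (mapcat (fn [x] [x x]) (range))))

    (println first-five) ; prints (0 0 1 1 2)
    ```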

    The blog you link to is about the danger of recursively using mapcat on a lazy tree, because it is eager on the first few elements (which can add up in a recursive application).

  • 2021-01-04 14:36

    Lazy-sequence production and consumption is different from lazy evaluation.

    Clojure functions do strict/eager evaluation of their arguments. Evaluation of an argument that is or that yields a lazy sequence does not force realization of the yielded lazy sequence in and of itself. However, any side effects caused by evaluation of the argument will occur.

    The ordinary use case for mapcat is to concatenate sequences yielded without side effects. Therefore, it hardly matters that some of the arguments are eagerly evaluated because no side effects are expected.

    Your function my-mapcat imposes additional laziness on the evaluation of its arguments by wrapping them in thunks (other lazy-seqs). This can be useful when significant side effects (IO, significant memory consumption, state updates) are expected. However, if your function both performs side effects and produces a sequence to be concatenated, warning bells should be going off in your head: your code probably needs refactoring.

    Here is something similar from algo.monads:

    (defn- flatten*
      "Like #(apply concat %), but fully lazy: it evaluates each sublist
       only when it is needed."
      [ss]
      (lazy-seq
        (when-let [s (seq ss)]
          (concat (first s) (flatten* (rest s))))))
    

    Another way to write my-mapcat:

    (defn my-mapcat [f coll] (for [x coll, fx (f x)] fx))
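
    A short usage sketch of this for-based version, assuming the definition above (repeated here so the snippet stands alone). The duplicating function is only an illustration. Because for also returns a lazy sequence, taking a few elements from an infinite input still terminates.

    ```clojure
    (defn my-mapcat [f coll] (for [x coll, fx (f x)] fx))

    ;; Take a finite prefix from an infinite input.
    (def prefix (take 5 (my-mapcat (fn [x] [x x]) (range))))

    (println prefix) ; prints (0 0 1 1 2)
    ```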
    

    Applying a function to a lazy sequence will force realization of a portion of that lazy sequence necessary to satisfy the arguments of the function. If that function itself produces lazy sequences as a result, those are not realized as a matter of course.

    Consider this function to count the realized portion of a sequence:

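    (defn count-realized [s]
      (loop [s s, n 0]
        (if (instance? clojure.lang.IPending s)
          ;; Pending (lazy) node: count it only if it has already been
          ;; realized, so that counting does not force it.
          (if (and (realized? s) (seq s))
            (recur (rest s) (inc n))
            n)
          ;; Non-pending node (e.g. a list or a Cons): already concrete.
          (if (seq s)
            (recur (rest s) (inc n))
            n))))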
    

    Now let's see what's being realized:

    (let [seq-of-seqs (map range (list 1 2 3 4 5 6))
          concat-seq (apply concat seq-of-seqs)]
      (println "seq-of-seqs: " (count-realized seq-of-seqs))
      (println "concat-seq: " (count-realized concat-seq))
      (println "seqs-in-seq: " (mapv count-realized seq-of-seqs)))          
    
     ;=> seq-of-seqs:  4
     ;   concat-seq:  0
     ;   seqs-in-seq:  [0 0 0 0 0 0]
    

    So, 4 elements of the seq-of-seqs got realized, but none of its component sequences were realized nor was there any realization in the concatenated sequence.

    Why 4? Because the applicable arity of concat is the variadic one, [x y & zs], which has 4 symbols in its argument vector (count the &).

    Compare to:

    (let [seq-of-seqs (map range (list 1 2 3 4 5 6))
          foo-seq (apply (fn foo [& more] more) seq-of-seqs)]
      (println "seq-of-seqs: " (count-realized seq-of-seqs))
      (println "seqs-in-seq: " (mapv count-realized seq-of-seqs)))
    
    ;=> seq-of-seqs:  2
    ;   seqs-in-seq:  [0 0 0 0 0 0]
    
    (let [seq-of-seqs (map range (list 1 2 3 4 5 6))
          foo-seq (apply (fn foo [a b c & more] more) seq-of-seqs)]
      (println "seq-of-seqs: " (count-realized seq-of-seqs))
      (println "seqs-in-seq: " (mapv count-realized seq-of-seqs)))
    
    ;=> seq-of-seqs:  5
    ;   seqs-in-seq:  [0 0 0 0 0 0]
    

    Clojure has two ways to make the evaluation of arguments lazy.

    One is macros. Unlike functions, macros do not evaluate their arguments.

    Here's a function with a side effect:

    (defn f [n] (println "foo!") (repeat n n))
    

    The side effects occur even though the sequence is not realized:

    user=> (def x (concat (f 1) (f 2)))
    foo!
    foo!
    #'user/x
    user=> (count-realized x)
    0
    

    Clojure has a lazy-cat macro to prevent this:

    user=> (def y (lazy-cat (f 1) (f 2)))
    #'user/y
    user=> (count-realized y)
    0
    user=> (dorun y)
    foo!
    foo!
    nil
    user=> (count-realized y)
    3
    user=> y
    (1 2 2)
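
    To see why lazy-cat defers the side effects, it can help to look at its expansion. A minimal sketch: the form is quoted, so f does not need to be defined for this to run.

    ```clojure
    ;; Expand the macro call once without evaluating it.
    (def expansion (macroexpand-1 '(lazy-cat (f 1) (f 2))))

    (println expansion)
    ;; prints (clojure.core/concat (clojure.core/lazy-seq (f 1)) (clojure.core/lazy-seq (f 2)))
    ```

    Each argument form ends up inside its own lazy-seq, so (f 1) and (f 2) run only when concat reaches them.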
    

    Unfortunately, you cannot apply a macro.

    The other way to delay evaluation is to wrap the arguments in thunks, which is exactly what you've done.
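
    A minimal sketch of the thunk approach, not the asker's own my-mapcat. The names lazy-mapcat, noisy and calls are hypothetical, introduced only for this illustration. Each call to f sits inside a lazy-seq, so it runs only when consumption reaches that element.

    ```clojure
    ;; Hypothetical helper: every call to f is deferred inside a lazy-seq.
    (defn lazy-mapcat [f coll]
      (lazy-seq
        (when-let [s (seq coll)]
          (concat (f (first s)) (lazy-mapcat f (rest s))))))

    ;; Hypothetical counter standing in for a side effect.
    (def calls (atom 0))
    (defn noisy [n] (swap! calls inc) (repeat n n))

    (def z (lazy-mapcat noisy (list 1 2 3)))
    (def calls-before @calls)      ; 0: nothing has been evaluated yet
    (def first-elem (first z))     ; forces only the first call to noisy
    (def calls-after-first @calls) ; 1
    (def all-elems (doall z))      ; realizes the rest
    (println calls-before first-elem calls-after-first all-elems @calls)
    ```

    Compared with the apply concat version, no call to noisy happens up front; the cost is one extra lazy-seq per element of the input.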
