How do I avoid Clojure's chunking behavior for lazy seqs that I want to short circuit?


I have a long, lazy sequence that I want to reduce and test lazily. As soon as two sequential elements are not = to each other (or fail some other predicate), I want to short-circuit and stop consuming the rest of the sequence.
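For example, chunked seqs such as the one returned by range are realized 32 elements at a time, so even taking a single element runs a mapped side effect 32 times (a minimal REPL sketch of the behaviour):

    ;; the whole first chunk is realized just to get one element
    (first (map #(do (print %) %) (range 100)))
    ;; prints 0123...31, returns 0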

3 Answers
  • 2020-11-28 09:56

    Looking in clojure.core at the definition of apply makes it obvious why it was consuming in groups of four when apply is used with an infinite sequence. reduce doesn't short-circuit either, so I'm left writing my own solution:

    (defn reducep
      "Like reduce, but for use with a predicate. Short-circuits on first false."
      ([p coll]
         (if-let [s (seq coll)]
           (reducep p (first s) (next s))
           (p)))
      ([p val coll]
         (if-let [s (seq coll)]
           (if-let [v (p val (first s))]
             (recur p (first s) (next s))
             false)
           true)))
    

    Then, using Stuart's unchunk with an extra and (so a finite input terminates instead of trailing an infinite run of nils):

    (defn unchunk [s]
      (lazy-seq
       (cons (first s)
             (and (next s)
                  (unchunk (next s))))))
    

    I get:

    (reducep = (map #(do (print %) %) (unchunk (range)))) ;; Prints 01, returns false
    (reducep = (map #(do (print %) %) (repeat 20 1))) ;; returns true
    (reducep = (map #(do (print %) %) (unchunk [0 0 2 4 5]))) ;; Returns false
    (reducep = (map #(do (print %) %) (unchunk [2 2 2 2 2]))) ;; returns true
    

    If that works for you too, mod this up.

    EDIT: Stuart's modified version of unchunk after his edit is probably preferable to the one in this earlier post.

  • 2020-11-28 10:10

    I found this post while hitting a time limit on a 4clojure problem, and came across another way to avoid the 32-element chunks:

    ;; add another dummy sequence parameter to the map:
    (apply = (map #(do (prn %2) %) (range) (range)))
    

    The higher-arity forms of map don't appear to use chunked sequences (Clojure 1.5).

    You have to do something with the second parameter, so being explicit about that might be better:

    (apply = (map (fn [i _] (prn i) i) (range) (range)))
    

    This isn't as neat as the other solutions but might be good for quick and dirty uses, like testing "is this slow because of chunking?".
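    A quick way to see the difference at the REPL (a sketch; the 32-element chunk size is an implementation detail):

    ;; single-collection map is chunked: taking one element runs the side effect 32 times
    (first (map #(do (prn %) %) (range 100)))   ; prints 0 through 31, returns 0

    ;; the higher-arity form walks its inputs one element at a time
    (first (map (fn [i _] (prn i) i) (range 100) (range 100)))   ; prints only 0, returns 0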

    Regarding apply, you could use partition to get pairs from the sequence and = them:

    (every? #(apply = %) (partition 2 1
        (map (fn [i _] (prn i) i) (range) (range))))
    

    Although reducep looks useful too.

    PS. I don't want to give the impression that the chunked sequence is slower; it's not. My issue is that the 4clojure test case calls "first" on my seq-generating function over a range of values, so chunking means I do 32 times the work. (PPS. My code is still too slow.)
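    For example (slow-step here is a hypothetical stand-in for an expensive element function):

    (defn slow-step [i] (Thread/sleep 10) i)

    ;; chunked: first pays for a whole chunk of 32 elements
    (time (first (map slow-step (range 100))))   ; roughly 320 ms

    ;; unchunked via the extra collection: first pays for one element
    (time (first (map (fn [i _] (slow-step i)) (range 100) (range 100))))   ; roughly 10 ms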

  • 2020-11-28 10:15

    CORRECTED TWICE: A simpler way to un-chunk a lazy sequence:

    (defn unchunk [s]
      (when (seq s)
        (lazy-seq
          (cons (first s)
                (unchunk (next s))))))
    

    The first version omitted the (when ...), so it returned an infinite seq of nils after the input sequence ended.

    The second version used first instead of seq, so it stopped early on a nil element.
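    A quick REPL check of the effect (output shown in the comments):

    (take 2 (map #(do (print %) %) (unchunk (range))))   ; prints 01, returns (0 1)

    (take 2 (map #(do (print %) %) (range)))             ; prints 0 through 31, returns (0 1)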

    RE: your other question, "how do I avoid chunking when doing the apply, which seems to be consuming in chunked groups of four":

    This is due to the definition of =, which, when given a sequence of arguments, forces the first 4:

    (defn =
      ;; ... other arities ...
      ([x y & more]
       (if (= x y)
         (if (next more)
           (recur y (first more) (next more))
           (= y (first more)))
         false)))
    
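    A quick check of that with the unchunk above (a sketch; the exact count may vary slightly between Clojure versions):

    (apply = (map #(do (print %) %) (unchunk (range))))
    ;; prints 0123, returns false -- about four elements are realized before the first comparison

    (apply = (map #(do (print %) %) (range)))
    ;; prints 0 through 31, returns false -- the chunked source realizes a whole chunk at once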