Difference between two maps

前端 未结 7 1487
悲&欢浪女
悲&欢浪女 2021-01-01 17:22

I need to very efficiently compare two maps in Clojure/Java, and return the difference as determined by Java\'s .equals(..), with nil/null equivalent to \"not present\".

相关标签:
7条回答
  • 2021-01-01 17:40

    You could also just use Maps.difference(..) method from Google's Guava libraries

    0 讨论(0)
  • 2021-01-01 17:41

    I'm not sure what the absolutely most efficient way to do this is, but here's a couple of things which may be useful:

    1. The basic expectation of behaviour from the question text is impossible: if a and b are maps such that b contains at least one key not present in a, (merge b <sth>) cannot be equal to a.

    2. If you end up going with an interop solution but then need to go back to a PersistentHashMap at some point, there's always

      (clojure.lang.PersistentHashMap/create
       (doto (java.util.HashMap.)
         (.put :foo 1)
         (.put :bar 2)))
      ; => {:foo 1 :bar 2}
      
    3. If you need to pass the keyset of a Clojure map to a Java method, you can use

      (.keySet {:foo 1 :bar 2})
      ; => #< [:foo, :bar]>
      
    4. If all keys involved are guaranteed to be Comparable, this could be exploited for efficient computation of difference on maps with many keys (sort & merge scan). For unconstrained keys this is of course a no-go and for small maps it could actually hurt performance.

    5. It's good to have a version written in Clojure, if only to set a baseline performance expectation. Here is one: (updated)

      (defn map-difference [m1 m2]
              (loop [m (transient {})
                     ks (concat (keys m1) (keys m2))]
                (if-let [k (first ks)]
                  (let [e1 (find m1 k)
                        e2 (find m2 k)]
                    (cond (and e1 e2 (not= (e1 1) (e2 1))) (recur (assoc! m k (e1 1)) (next ks))
                          (not e1) (recur (assoc! m k (e2 1)) (next ks))
                          (not e2) (recur (assoc! m k (e1 1)) (next ks))
                          :else    (recur m (next ks))))
                  (persistent! m))))
      

      I think that just doing (concat (keys m1) (keys m2)) and possibly duplicating some work is likely more efficient most of the time than checking a given key is in "the other map" too at every step.

    To wrap up the answer, here's a very simple-minded set-based version with the nice property that it says what it does -- if I misunderstood the spec, it should be readily apparent here. :-)

    (defn map-difference [m1 m2]
      (let [ks1 (set (keys m1))
            ks2 (set (keys m2))
            ks1-ks2 (set/difference ks1 ks2)
            ks2-ks1 (set/difference ks2 ks1)
            ks1*ks2 (set/intersection ks1 ks2)]
        (merge (select-keys m1 ks1-ks2)
               (select-keys m2 ks2-ks1)
               (select-keys m1
                            (remove (fn [k] (= (m1 k) (m2 k)))
                                    ks1*ks2)))))
    
    0 讨论(0)
  • 2021-01-01 17:45

    I am not sure about its performance

    (defn map-difference
      [orig other]
      (let [changed (set/difference (set orig) (set other))
            added (set/difference (set (keys other)) (set (keys orig)))]
        (reduce (fn [acc key]
                  (assoc acc key :missing))
          (into {} changed)
          added)))
    

    I used :missing key to avoid ambiguity between a nil value in the original map, and a missing key -- you can of course change it to nil.

    0 讨论(0)
  • 2021-01-01 17:47

    What about...

    (defn map-diff [m1 m2]
      ;; m1: hashmap
      ;; m2: hashmap
      ;; => the difference between them
      (reduce merge
              (map #(hash-map % (- (or (% m1) 0) (or (% m2) 0)))
                   (keys (merge m1 m2)))))
    
    0 讨论(0)
  • 2021-01-01 17:54
    1. Clojure maps will be fine because reading clojure maps is very fast.

    2. I can't answer you but I can give you something to look at. Brenton Ashworth made a testtool where he solved the problem with map compares. Maybe you can look at his code to get hint for implementation. http://formpluslogic.blogspot.com/2010/07/better-clojure-test-results-with-deview.html and http://github.com/brentonashworth/deview

    3. I don't think there is a better implementation that compare the keys and look up if the are different.

    0 讨论(0)
  • 2021-01-01 17:55

    In Java, Google Commons Collections offer a quite performant solution.

    0 讨论(0)
提交回复
热议问题