If I try to do 1000 000 assoc!
on a transient vector, I\'ll get a vector of 1000 000 elements
(count
(let [m (transient [])]
(dotimes [i 1
The simplest explanation is from the Clojure documentation itself (emphasis mine):
Transients support a parallel set of 'changing' operations, with similar names followed by ! - assoc!, conj! etc. These do the same things as their persistent counterparts except the return values are themselves transient. Note in particular that transients are not designed to be bashed in-place. You must capture and use the return value in the next call.
The transient datatypes' operations don't guarantee that they will return the same reference as the one passed in. Sometimes the implementation might decide to return a new (but still transient) map after an assoc!
rather than using the one you passed in.
The ClojureDocs page on assoc! has a nice example that explains this behavior:
;; The key concept to understand here is that transients are
;; not meant to be `bashed in place`; always use the value
;; returned by either assoc! or other functions that operate
;; on transients.
(defn merge2
"An example implementation of `merge` using transients."
[x y]
(persistent! (reduce
(fn [res [k v]] (assoc! res k v))
(transient x)
y)))
;; Why always use the return value, and not the original? Because the return
;; value might be a different object than the original. The implementation
;; of Clojure transients in some cases changes the internal representation
;; of a transient collection (e.g. when it reaches a certain size). In such
;; cases, if you continue to try modifying the original object, the results
;; will be incorrect.
;; Think of transients like persistent collections in how you write code to
;; update them, except unlike persistent collections, the original collection
;; you passed in should be treated as having an undefined value. Only the return
;; value is predictable.
I'd like to repeat that last part because it's very important: the original collection you passed in should be treated as having an undefined value. Only the return value is predictable.
Here's a modified version of your code that works as expected:
(count
(let [m (transient {})]
(persistent!
(reduce (fn [acc i] (assoc! acc i i))
m (range 1000000)))))
As a side note, the reason you always get 8 is because Clojure likes to use a clojure.lang.PersistentArrayMap
(a map backed by an array) for maps with 8 or fewer elements. Once you get past 8, it switches to clojure.lang.PersistentHashMap
.
user=> (type '{1 a 2 a 3 a 4 a 5 a 6 a 7 a 8 a})
clojure.lang.PersistentArrayMap
user=> (type '{1 a 2 a 3 a 4 a 5 a 6 a 7 a 8 a 9 a})
clojure.lang.PersistentHashMap
Once you get past 8 entries, your transient map switches the backing data structure from an array of pairs (PersistentArrayMap
) to a hashtable (PersistentHashMap
), at which point assoc!
returns a new reference instead of just updating the old one.