I have implemented an algorithm to calculate the longest contiguous common subsequence (not to be confused with longest common subsequence, though not important for
EDIT: Added a faster uglier version below the first one.
Here is my take:
(defn my-lcs [^objects a1 ^objects a2]
(first
(let [n (inc (alength a1))]
(areduce a1 i
[max-len ^ints prev ^ints curr] [0 (int-array n) (int-array n)]
[(areduce a2 j max-len (unchecked-long max-len)
(let [match-len
(if (.equals (aget a1 i) (aget a2 j))
(unchecked-inc (aget prev j))
0)]
(aset curr (unchecked-inc j) match-len)
(if (> match-len max-len)
match-len
max-len)))
curr prev]))))
Main differences with yours: a[gs]et
vs a[gs]et-int
, use of unchecked-
ops (implicitly through areduce
), use of a vector as the return value (and "swap" mechanism) and max-len is coerced to primitive before the inner loop (primitive-valued loops are problematic, slightly less since 1.5RC2 but the support isn't perfect yet, however *warn-on-reflection*
is not silent).
And I switched to .equals
instead of =
to avoid the logic in Clojure's equiv.
EDIT: let's get ugly and restore the arrays swap trick:
(deftype F [^:unsynchronized-mutable ^ints curr
^:unsynchronized-mutable ^ints prev]
clojure.lang.IFn
(invoke [_ a1 a2]
(let [^objects a1 a1
^objects a2 a2]
(areduce a1 i max-len 0
(let [m (areduce a2 j max-len (unchecked-long max-len)
(let [match-len
(if (.equals (aget a1 i) (aget a2 j))
(unchecked-inc (aget prev j))
0)]
(aset curr (unchecked-inc j) (unchecked-int match-len))
(if (> match-len max-len)
match-len
max-len)))
bak curr]
(set! curr prev)
(set! prev bak)
m)))))
(defn my-lcs2 [^objects a1 a2]
(let [n (inc (alength a1))
f (F. (int-array n) (int-array n))]
(f a1 a2)))
On my box, it's 30% faster.
Here are a couple improvements:
Beyond that (and the long type hint on the recur mentioned above), I don't see any obvious ways to improve further.
(defn lcs
[^objects a1 ^objects a2]
(let [a1-len (alength a1)
a2-len (alength a2)
prev (int-array (inc a2-len))
curr (int-array (inc a2-len))]
(loop [i 0 max-len 0 prev prev curr curr]
(if (< i a1-len)
(recur (inc i)
(long (loop [j 0 max-len max-len]
(if (< j a2-len)
(if (= (aget a1 i) (aget a2 j))
(let [match-len (inc (aget prev j))]
(do
(aset curr (inc j) match-len)
(recur (inc j) (max max-len match-len))))
(do
(aset curr (inc j) 0)
(recur (inc j) max-len)))
max-len)))
curr
prev)
max-len))))
#'user/lcs
user> (time (lcs a1 a2))
"Elapsed time: 3862.211 msecs"