问题
In clojure[script]
, how to write a function nearest
that receives two sorted vectors a
, b
and returns for each element of a
the nearest element of b
?
As an example,
(nearest [1 2 3 101 102 103] [0 100 1000]); [0 0 0 100 100 100]
I would like the solution to be both idiomatic and with good performances: O(n^2)
is not acceptable!
回答1:
Using a binary search or a sorted-set incurs a O(n*log m) time complexity where n is (count a)
and m (count b)
.
However leveraging the fact that a and b are sorted the time complexity can be O(max(n, m)).
(defn nearest [a b]
(if-let [more-b (next b)]
(let [[x y] b
m (/ (+ x y) 2)
[<=m >m] (split-with #(<= % m) a)]
(lazy-cat (repeat (count <=m) x)
(nearest >m more-b)))
(repeat (count a) (first b))))
=> (nearest [1 2 3 101 102 103 501 601] [0 100 1000])
(0 0 0 100 100 100 100 1000)
回答2:
Let n
be (count a)
and m
be (count b)
. Then, if a
and b
are both ordered, then this can be done in what I believe ought to be O(n log(log m))
time, in other words, very close to linear in n
.
First, let's re-implement abs
and a binary-search (improvements here) to be independent of host (leveraging a native, e.g. Java's, version ought to be significantly faster)
(defn abs [x]
(if (neg? x) (- 0 x) x))
(defn binary-search-nearest-index [v x]
(if (> (count v) 1)
(loop [lo 0 hi (dec (count v))]
(if (== hi (inc lo))
(min-key #(abs (- x (v %))) lo hi)
(let [m (quot (+ hi lo) 2)]
(case (compare (v m) x)
1 (recur lo m)
-1 (recur m hi)
0 m))))
0))
If b
is sorted, a binary search in b
takes log m
steps. So, mapping this over a
is a O(n log m)
solution, which for the pragmatist is likely good enough.
(defn nearest* [a b]
(map #(b (binary-search-nearest-index b %)) a))
However, we can also use the fact that a
is sorted to divide and conquer a
.
(defn nearest [a b]
(if (< (count a) 3)
(nearest* a b)
(let [m (quot (count a) 2)
i (binary-search-nearest-index b (a m))]
(lazy-cat
(nearest (subvec a 0 m) (subvec b 0 (inc i)))
[(b i)]
(nearest (subvec a (inc m)) (subvec b i))))))
I believe this ought to be O(n log(log m))
. We start with the median of a
and find nearest in b
in log m
time. Then we recurse on each half of a
with split portions of b
. If a m
-proportional factor of b are split each time, you have O(n log log m). If only a constant portion is split off then the half of a
working on that portion is linear time. If that continues (iterative halves of a work on constant size portions of b
) then you have O(n)
.
回答3:
Inspired by @amalloy, I have found this interesting idea by Chouser and wrote this solution:
(defn abs[x]
(max x (- x)))
(defn nearest-of-ss [ss x]
(let [greater (first (subseq ss >= x))
smaller (first (rsubseq ss <= x))]
(apply min-key #(abs (- % x)) (remove nil? [greater smaller]))))
(defn nearest[a b]
(map (partial nearest-of-ss (apply sorted-set a)) b))
Remark: It's important to create the sorted-set
only once, in order to avoid performance penalty!
来源:https://stackoverflow.com/questions/22205858/in-clojurescript-how-to-return-nearest-elements-between-2-sorted-vectors