I have a sorted unique array and want to efficiently insert an element into it that is not in the array like this:
a = [1,2,4,5,6]
new_elm = 3
insert_at = a.bsearch_index {|x| x > new_elm } # => 2
a.insert(insert_at, new_elm) # now a = [1,2,3,4,5,6]
The method bsearch_index
doesn't exist: only bsearch
, which returns the matching element rather than the matching element's index. Is there any built in way to accomplish this?
You can use the Enumerator
object returned by each_with_index
to return a nested array of [value, index]
pairs and then conduct your binary search on that array:
a = [1,2,4,5,6]
new_elm = 3
index = [*a.each_with_index].bsearch{|x, _| x > new_elm}.last
=> 2
a.insert(index, new_elm)
EDIT:
I've run some simple benchmarks in response to your question with an array of length 1e6 - 1
:
require 'benchmark'
def binary_insert(a,e)
index = [*a.each_with_index].bsearch{|x, _| x > e}.last
a.insert(index, e)
end
a = *1..1e6
b = a.delete_at(1e5)
=> 100001
Benchmark.measure{binary_insert(a,b)}
=> #<Benchmark::Tms:0x007fd3883133d8 @label="", @real=0.37332, @cstime=0.0, @cutime=0.0, @stime=0.029999999999999805, @utime=0.240000000000002, @total=0.2700000000000018>
With this in mind, you might consider trying using a heap or a trie instead of an array to store your values. Heaps in particular have constant insertion and removal time complexities, making them ideal for large storage applications. Check out this article here: Ruby algorithms: sorting, trie, and heaps
How about using SortedSet
?:
require 'set'
a = SortedSet.new [1,2,4,5,6]
new_elm = 3
a << new_elm # now a = #<SortedSet: {1, 2, 3, 4, 5, 6}>
SortedSet is implemented using rbtree
. I've made the following benchmark:
def test_sorted(max_idx)
arr_1 = (0..max_idx).to_a
new_elm = arr_1.delete(arr_1.sample)
arr_2 = arr_1.dup
set_1 = SortedSet.new(arr_1)
Benchmark.bm do |x|
x.report { arr_1.insert(arr_1.index { |x| x > new_elm }) }
x.report { arr_2.insert([*arr_2.each_with_index].bsearch{|x, _| x > new_elm}.last) }
x.report { set_1 << new_elm }
end
end
With the following results:
test_sorted 10_000
# => user system total real
# => 0.000000 0.000000 0.000000 ( 0.000900)
# => 0.010000 0.000000 0.010000 ( 0.001868)
# => 0.000000 0.000000 0.000000 ( 0.000007)
test_sorted 100_000
# => user system total real
# => 0.000000 0.000000 0.000000 ( 0.001150)
# => 0.000000 0.010000 0.010000 ( 0.048040)
# => 0.000000 0.000000 0.000000 ( 0.000013)
test_sorted 1_000_000
# => user system total real
# => 0.040000 0.000000 0.040000 ( 0.062719)
# => 0.280000 0.000000 0.280000 ( 0.356032)
# => 0.000000 0.000000 0.000000 ( 0.000012)
"The method bsearch_index
doesn't exist": Ruby 2.3 introduces bsearch_index. (Kudos for getting the method name right before it existed).
Try this
(0...a.size).bsearch { |n| a[n] > new_element }
This uses bsearch
defined on Range
to search the array and thus returns the index.
Performance will be way better than each_with_index
which materializes O(n)
temporary array tuples and thus clogs up garbage collection.
Ruby 2.3.1 introduced bsearch_index thus the question can now be resolved this way:
a = [1,2,4,5,6]
new_elm = 3
index = a.bsearch_index{|x, _| x > new_elm}
=> 2
a.insert(index, new_elm)
The index
method accepts a block and will return the first index where the block is true
a = [1,2,4,5,6]
new_elem = 3
insert_at = a.index{|b| b > new_elem}
#=> 2
a.insert(insert_at, new_elm)
#=>[1,2,3,4,5,6]
来源:https://stackoverflow.com/questions/23481725/using-bsearch-to-find-index-for-inserting-new-element-into-sorted-array