vector vs map performance confusion

一个人的身影 2020-12-30 23:44

edit: I am specifically comparing std::vector's linear search operations to the std::map binary search operations because that i

2 Answers
  • 2020-12-30 23:55

    I found the slides for easier reference (I can't see the graphs, but I guess that might be because of a proprietary file format). The relevant slide is number 39, which describes the problem being solved:

    § Generate N random integers and insert them into a sequence so that each is inserted in its proper position in the numerical order.

    § Remove elements one at a time by picking a random position in the sequence and removing the element there.

    Now, it should be rather obvious that a linked list is not a good choice for this problem. Even though a list is much better than a vector for inserting/removing at the beginning or in the middle, it's not good for inserting/removing at a random position because of the need for a linear search. And linear search is much faster with vectors because of better cache efficiency.
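
    For reference, here's a minimal sketch of how the vector handles this task with a plain linear search for the insertion point (my assumption of the overall shape, not Sutter's actual code); replacing std::find_if with std::lower_bound gives the binary-search variant:

        #include <algorithm>
        #include <cstdlib>
        #include <vector>

        // Sketch of the vector-based approach (assumed shape, not Sutter's code):
        // insert each random value at its sorted position via linear search,
        // then remove elements from random positions until the sequence is empty.
        void vector_workload(int n)
        {
            std::vector<int> seq;
            seq.reserve(n);

            for (int i = 0; i < n; ++i) {
                int value = std::rand();
                // Linear scan for the insertion point; cache-friendly despite O(n).
                auto pos = std::find_if(seq.begin(), seq.end(),
                                        [value](int x) { return x >= value; });
                seq.insert(pos, value);   // shifting the tail is O(n) as well
            }

            while (!seq.empty())
                seq.erase(seq.begin() + std::rand() % seq.size());
        }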

    Sutter suggests that a map (or a tree in general) would seem a natural choice for this algorithm because you get O(log n) search. And indeed, it does beat the vector quite easily for large N values in the insertion part.

    Here comes the but. You need to remove the nth element (for random n). This is where I believe your code is cheating. You remove the elements in the order they were inserted, effectively using the input vector as a lookup table for finding the value of the element at a "random" position, so that you can search for it in O(log n). So you're really using a combination of a set and a vector to solve the problem.
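
    To illustrate what I mean (a hypothetical sketch, since I don't have your actual code): if the removal loop walks the original input vector, every removal from the set is just an O(log n) erase-by-value and no positional lookup ever happens:

        #include <set>
        #include <vector>

        // Hypothetical sketch of the "cheating" removal: the input vector tells us
        // which *value* to remove, so the set only has to erase by value in
        // O(log n) and never has to locate the element at a random *position*.
        void remove_in_insertion_order(std::set<int>& s, const std::vector<int>& input)
        {
            for (int value : input)   // iterate in the original insertion order
                s.erase(value);       // O(log n) erase-by-value
        }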

    A regular binary search tree such as the one used for std::map or std::set (which I assume Sutter used) doesn't have a fast algorithm for finding the nth element. Here's one which is claimed to be O(log n) on average and O(n) in the worst case. But std::map and std::set don't provide access to the underlying tree structure, so for those you're stuck with in-order traversal (correct me if I'm wrong), which is a linear search again! I'm actually surprised that the map version is competitive with the vector one in Sutter's results.
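
    For example, reaching the element at position n through std::set's public interface boils down to advancing an iterator step by step:

        #include <cstddef>
        #include <iterator>
        #include <set>

        // std::set iterators are bidirectional, not random-access, so getting
        // the nth smallest element is an O(n) in-order walk.
        int nth_of_set(const std::set<int>& s, std::size_t n)
        {
            return *std::next(s.begin(), n);
        }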

    For O(log n) complexity, you need a structure such as an order statistic tree, which is unfortunately not provided by the standard library. There's the GNU policy-based STL map, as shown here.
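
    As I understand the pb_ds interface (GCC-specific, so treat this as a sketch), the order statistic tree gives you the nth element in O(log n) via find_by_order:

        #include <cstddef>
        #include <functional>
        #include <ext/pb_ds/assoc_container.hpp>
        #include <ext/pb_ds/tree_policy.hpp>

        // GCC's policy-based red-black tree with order statistics:
        // find_by_order(n) returns an iterator to the nth smallest key in O(log n),
        // which is exactly the operation std::set lacks.
        using ost = __gnu_pbds::tree<int,
                                     __gnu_pbds::null_type,
                                     std::less<int>,
                                     __gnu_pbds::rb_tree_tag,
                                     __gnu_pbds::tree_order_statistics_node_update>;

        int nth_of_ost(const ost& t, std::size_t n)
        {
            return *t.find_by_order(n);   // O(log n) positional access
        }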

    Here is some quick test code I made for vector vs set vs OST (vs vector with binary search for good measure): https://ideone.com/DoqL4H. The set is much slower, while the other tree-based structure is faster than the vector, which is not in line with Sutter's results.

    order statistics tree: 15.958ms
    vector binary search: 99.661ms
    vector linear search: 282.032ms
    set: 2004.04ms
    

    (N = 20000; the difference is only going to grow in favor of the OST with larger values.)

    In short, I came to the same conclusion that Sutter's original results seem odd, but for a slightly different reason: it seems to me that better asymptotic complexity wins over lower constant factors this time.

    Note that the problem description doesn't exclude the possibility of duplicate random values, so using map/set instead of multimap/multiset is cheating a bit in favor of the map/set, but I assume that has only minor significance when the value domain is much larger than N. Also, pre-reserving the vector doesn't improve performance significantly (around 1% when N = 20000).

  • 2020-12-31 00:19

    It's of course difficult to give a precise answer in the absence of source code plus information about compiler options, hardware, etc.

    A couple of possible differences:

    • I think the speaker is talking about inserts/removals in the middle of the vector each time (whereas you always add to the end and remove from the beginning in your example);
    • The speaker makes no mention of how to determine which items are added/removed, but in that case we may as well assume that the minimum effort is made to determine this: you make a vector access each time, whereas the inserted values may well simply be calculated (e.g. using a low-cost PRNG to determine the next value to insert) or always be the same, and for removal the middle element could be removed each time, so no value needs to be looked up or calculated (see the sketch after this list).
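
    As a rough sketch of that scheme (my guess at the shape, not the speaker's code): the next value could come from a cheap PRNG and go to its sorted position, while removal always takes the middle element, so no separate lookup table is needed:

        #include <algorithm>
        #include <vector>

        // Rough sketch (assumed, not the speaker's code): insert cheaply generated
        // values at their sorted positions and always remove the middle element,
        // so neither operation needs to look anything up in a second container.
        void middle_heavy_workload(std::vector<int>& seq, int iterations)
        {
            unsigned state = 12345u;                        // low-cost LCG state
            for (int i = 0; i < iterations; ++i) {
                state = state * 1664525u + 1013904223u;     // LCG step
                int value = static_cast<int>(state >> 16);

                auto pos = std::lower_bound(seq.begin(), seq.end(), value);
                seq.insert(pos, value);

                seq.erase(seq.begin() + seq.size() / 2);    // remove the middle element
            }
        }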

    However, as another commenter has mentioned, I would take away the general principle rather than the specific numbers/timings. Essentially, the take-away message is: what you thought you knew about "counting operations" for the sake of assessing algorithm performance/scalability is no longer true on modern systems.
