Concurrent data structure design

前端 未结 11 629
名媛妹妹
名媛妹妹 2021-01-30 18:44

I am trying to come up with the best data structure for use in a high throughput C++ server. The data structure will be used to store anything from a few to several million obje

11条回答
  •  一生所求
    2021-01-30 19:10

    Personally, I'm quite fond of persistent immutable data structures in highly-concurrent situations. I don't know of any specifically for C++, but Rich Hickey has created some excellent (and blisteringly fast) immutable data structures in Java for Clojure. Specifically: vector, hashtable and hashset. They aren't too hard to port, so you may want to consider one of those.

    To elaborate a bit more, persistent immutable data structures really solve a lot of problems associated with concurrency. Because the data structure itself is immutable, there isn't a problem with multiple threads reading/iterating concurrently (so long as it is a const iterator). "Writing" can also be asynchronous because it's not really writing to the existing structure but rather creating a new version of that structure which includes the new element. This operation is made efficient (O(1) in all of Hickey's structures) by the fact that you aren't actually copying everything. Each new version shares most of its structure with the old version. This makes things more memory efficient, as well as dramatically improving performance over the simple copy-on-write technique.

    With immutable data structures, the only time where you actually need to synchronize is in actually writing to a reference cell. Since memory access is atomic, even this can usually be lock-free. The only caveat here is you might lose data between threads (race conditions). The data structure will never be corrupted due to concurrency, but that doesn't mean that inconsistent results are impossible in situations where two threads create a new version of the structure based on a single old and attempt to write their results (one of them will "win" and the other's changes will be lost). To solve this problem, you either need to have a lock for "writing operations", or use some sort of STM. I like the second approach for ease-of-use and throughput in low-collision systems (writes are ideally non-blocking and reads never block), but either one will work.

    You've asked a tough question, one for which there isn't really a good answer. Concurrency-safe data structures are hard to write, particularly when they need to be mutable. Completely lock-free architectures are provably impossible in the presence of shared state, so you may want to give up on that requirement. The best you can do is minimize the locking required, hence the immutable data structures.

提交回复
热议问题