set vs unordered_set for fastest iteration

无人及你 2021-02-13 02:18

In my application I have the following requirements -

  1. The data structure will be populated just once with some values (not key/value pairs). The values may be r

5 Answers
  •  我在风中等你
    2021-02-13 02:53

    There are several approaches.

    1. The comments to your question suggest keeping a std::unordered_set, which has the fastest lookup/insertion (O(1) on average) and O(N) iteration (as does every container). If your data changes a lot or requires many random lookups, this is probably the fastest. But test.
    2. If you need to iterate hundreds of times without intermediate insertions, you can do a single O(N) copy to a std::vector and benefit from the contiguous memory layout on every one of those iterations. Test whether this is faster than a regular std::unordered_set.
    3. If you have a small number of intermediate insertions between iterations, it could pay to use a dedicated vector. If you can use Boost.Container, try boost::flat_set, which offers a std::set interface with a std::vector storage back-end (i.e. a contiguous memory layout that is very cache- and prefetch-friendly). Again, test whether this gives a speed-up over the other two solutions.
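
As a rough illustration of the second approach, here is a minimal sketch that copies a std::unordered_set into a std::vector once and then iterates over the vector repeatedly. The function names make_snapshot and sum_repeatedly are hypothetical, and summing the elements is only a stand-in for whatever work your loop actually does. Whether this beats iterating the set directly depends on your data and hardware, so measure it.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical helper: copy the set once into contiguous storage (one O(N) pass).
std::vector<int> make_snapshot(const std::unordered_set<int>& values) {
    std::vector<int> snapshot(values.begin(), values.end());
    assert(snapshot.size() == values.size());
    return snapshot;
}

// Hypothetical stand-in for the real per-iteration work:
// walk the contiguous snapshot `passes` times and sum the elements.
std::int64_t sum_repeatedly(const std::vector<int>& snapshot, int passes) {
    std::int64_t total = 0;
    for (int p = 0; p < passes; ++p) {
        for (int v : snapshot) {
            total += v;
        }
    }
    return total;
}
```

The snapshot goes stale as soon as the set changes, so this only fits the case where no insertions happen between iterations.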

    For the last solution, see the Boost documentation for some of the tradeoffs (it's good to be aware of all the other issues like iterator invalidation, move semantics and exception safety as well):

    Boost.Container flat_[multi]map/set containers are ordered-vector based associative containers based on Austern's and Alexandrescu's guidelines. These ordered vector containers have also benefited recently with the addition of move semantics to C++, speeding up insertion and erasure times considerably. Flat associative containers have the following attributes:

    • Faster lookup than standard associative containers
    • Much faster iteration than standard associative containers
    • Less memory consumption for small objects (and for big objects if shrink_to_fit is used)
    • Improved cache performance (data is stored in contiguous memory)
    • Non-stable iterators (iterators are invalidated when inserting and erasing elements)
    • Non-copyable and non-movable values types can't be stored
    • Weaker exception safety than standard associative containers (copy/move constructors can throw when shifting values in erasures and insertions)
    • Slower insertion and erasure than standard associative containers (specially for non-movable types)

    NOTE: "faster lookup" here means that a flat_set does its O(log N) search on contiguous memory, rather than the O(log N) pointer chasing of a regular std::set. Of course, a std::unordered_set does O(1) lookup on average, which will be faster for large N.
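
To make the sorted-vector idea concrete without depending on Boost, here is a minimal standard-library sketch of the same technique: a std::vector kept in sorted order, with binary search for lookup. This is not boost::flat_set itself, only an illustration of the layout it is based on; sorted_insert and sorted_contains are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical helper: insert `value` while keeping `sorted` ordered and
// duplicate-free. Returns true if the value was inserted.
bool sorted_insert(std::vector<int>& sorted, int value) {
    assert(std::is_sorted(sorted.begin(), sorted.end()));
    auto pos = std::lower_bound(sorted.begin(), sorted.end(), value);
    if (pos != sorted.end() && *pos == value) {
        return false;  // already present
    }
    // O(N) in the worst case: shifts later elements and invalidates iterators.
    sorted.insert(pos, value);
    return true;
}

// Hypothetical helper: O(log N) lookup by binary search over contiguous memory.
bool sorted_contains(const std::vector<int>& sorted, int value) {
    return std::binary_search(sorted.begin(), sorted.end(), value);
}
```

The insert shows where the tradeoffs listed above come from: lookups and iteration touch contiguous memory, while each insertion may move many elements and invalidates existing iterators.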
