Initializing a std::map when the size is known in advance

再見小時候 2020-12-30 19:00

I would like to initialize a std::map. For now I am using insert, but I feel I am wasting some computational time since I already know the size in advance.

5 Answers
  • 2020-12-30 19:32

    Not sure if this answers your question, but Boost.Container has a flat_map in which you can reserve space. Basically you can see this as a sorted vector of (key, value) pairs. Tip: if you also know that your input is sorted, you can use insert with hint for maximal performance.
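    The "insert with hint" tip also applies to a plain std::map. A minimal sketch (the keys and values here are made up for illustration): when the input is already sorted ascending, hinting each insert at end() makes each insertion amortized constant time instead of O(log n).

    ```cpp
    #include <iostream>
    #include <map>

    int main() {
        std::map<int, int> m;
        // Keys arrive in ascending order, so end() is always the correct hint.
        for (int k = 0; k < 5; ++k)
            m.emplace_hint(m.end(), k, k * 10);
        std::cout << m.size() << " " << m.at(3) << "\n"; // prints "5 30"
    }
    ```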

  • 2020-12-30 19:33

    You are talking about block allocators, but they are hard to implement. Measure before attempting such hard things. Anyway, Boost has some articles about implementing a block allocator. Or use an already-implemented preallocated map such as Stree.

  • 2020-12-30 19:35

    No, the members of the map are internally stored in a tree structure. There is no way to build the tree until you know the keys and values that are to be stored.

  • 2020-12-30 19:46

    The short answer is: yes, this is possible, but it's not trivial. You need to define a custom allocator for your map. The basic idea is that your custom allocator will set aside a single block of memory for the map. As the map requires new nodes, the allocator will simply assign them addresses within the pre-allocated block. Something like this:

    // MyAllocator is a hypothetical custom allocator type; std::allocator
    // has no reserve(), so you would have to provide that member yourself.
    std::map<KeyType, ValueType, std::less<KeyType>, MyAllocator> myMap;

    myMap.get_allocator().reserve( nodeSize * numberOfNodes );
    

    There are a number of issues you'll have to deal with, however.

    First, you don't really know the size of each map node or how many allocations the map will perform. These are internal implementation details. You can experiment to find out, but you can't assume that the results will hold across different compilers (or even future versions of the same compiler). Therefore, you shouldn't worry about allocating a "fixed" size map. Rather, your goal should be to reduce the number of allocations required to a handful.

    Second, this strategy becomes quite a bit more complex if you want to support deletion.

    Third, don't forget memory alignment issues. The pointers your allocator returns must be properly aligned for the various types of objects the memory will store.

    All that being said, before you try this, make sure it's necessary. Memory allocation can be very expensive, but you still shouldn't assume that it's a problem for your program. Measure to find out. You should also consider alternative strategies that more naturally allow pre-allocation. For example, a sorted list or a std::unordered_map.
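    To make the idea above concrete, here is one possible sketch, under the assumptions the answer states: a monotonic "arena" allocator that grabs one block up front and bumps a pointer for each node, never freeing individual nodes (so deletion support, the second issue above, is deliberately punted on). The Arena and ArenaAllocator names are invented for this example; the buffer size is a guess you would tune by measuring.

    ```cpp
    #include <cstddef>
    #include <iostream>
    #include <map>
    #include <new>
    #include <vector>

    // One pre-allocated block; allocation just bumps an offset.
    struct Arena {
        explicit Arena(std::size_t bytes) : buf(bytes), offset(0) {}
        void* allocate(std::size_t n, std::size_t align) {
            std::size_t p = (offset + align - 1) & ~(align - 1); // alignment issue #3
            if (p + n > buf.size()) throw std::bad_alloc();
            offset = p + n;
            return buf.data() + p;
        }
        std::vector<char> buf;
        std::size_t offset;
    };

    template <class T>
    struct ArenaAllocator {
        using value_type = T;
        Arena* arena;
        explicit ArenaAllocator(Arena* a) : arena(a) {}
        template <class U>
        ArenaAllocator(const ArenaAllocator<U>& o) : arena(o.arena) {} // rebind
        T* allocate(std::size_t n) {
            return static_cast<T*>(arena->allocate(n * sizeof(T), alignof(T)));
        }
        void deallocate(T*, std::size_t) {} // monotonic: memory freed with the arena
        template <class U>
        bool operator==(const ArenaAllocator<U>& o) const { return arena == o.arena; }
        template <class U>
        bool operator!=(const ArenaAllocator<U>& o) const { return arena != o.arena; }
    };

    int main() {
        Arena arena(1 << 16); // 64 KiB up front -- a guess, not a computed node size
        using Alloc = ArenaAllocator<std::pair<const int, int>>;
        std::map<int, int, std::less<int>, Alloc> m{Alloc(&arena)};
        for (int i = 0; i < 100; ++i) m[i] = i * i;
        std::cout << m.size() << " " << m.at(7) << "\n"; // prints "100 49"
    }
    ```

    Note that the map internally rebinds the allocator to its (implementation-defined) node type via the converting constructor, which is exactly why you cannot portably compute the node size in advance.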

  • 2020-12-30 19:46

    There are several good answers to this question already, but they miss some primary points.

    Initialize the map directly

    The map knows the size up front if initialized directly with iterators:

    auto mymap = std::map(it_begin, it_end); // C++17: key/value types deduced
    
    

    This is the best way to dodge the issue. If you are agnostic about the implementation, the map can know the size up front from the iterators, and you have moved the problem to the std:: implementation to worry about.

    Alternatively use insert with iterators instead, that is:

    mymap.insert(it_begin, it_end);
    

    See: https://en.cppreference.com/w/cpp/container/map/insert
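    Both forms can be sketched together; the source container and values here are invented for illustration, and the element types are written out explicitly so the example also works before C++17:

    ```cpp
    #include <iostream>
    #include <map>
    #include <utility>
    #include <vector>

    int main() {
        std::vector<std::pair<int, int>> src{{1, 10}, {2, 20}, {3, 30}};

        // Construct directly from the iterator range.
        std::map<int, int> byCtor(src.begin(), src.end());

        // Or bulk-insert the range into an existing map.
        std::map<int, int> byInsert;
        byInsert.insert(src.begin(), src.end());

        std::cout << byCtor.size() << " " << byInsert.at(2) << "\n"; // prints "3 20"
    }
    ```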

    Beware of Premature optimization

    but I feel I am wasting some computational time.

    This sounds a lot like you are optimizing prematurely (meaning you do not know where the bottleneck is - you are guessing, or seeing an issue that isn't really one). Instead, measure first and then optimize - repeat if necessary.

    Memory allocation could already be optimized, to a large degree

    Rolling your own block allocator for the map could be close to fruitless. On modern systems (here I include the OS/hardware and the C++ language level), memory allocation is already very well optimized for the general case, and you could see little or no improvement from rolling your own block allocator. Even if you take a lot of care and get the map into one contiguous array - an improvement in itself - you could still face the problem that, in the end, the elements may be placed randomly in the array (e.g. in insertion order) and be less cache friendly anyway (this depends very much on your actual use case, though - I'm assuming a very large data set).

    Use another container or third party map

    If you are still facing this issue, the best approach is probably to use another container (e.g. a sorted std::vector - use std::lower_bound for lookups) or a third-party map optimized for how you are using it. A good example is flat_map from Boost - see this answer.
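    The sorted-vector alternative can be sketched with only the standard library (the keys and values are made up for illustration); reserve is exactly the "size known in advance" facility std::map lacks:

    ```cpp
    #include <algorithm>
    #include <iostream>
    #include <string>
    #include <utility>
    #include <vector>

    int main() {
        // Sorted vector of (key, value) pairs as a flat-map substitute.
        std::vector<std::pair<int, std::string>> flat;
        flat.reserve(3); // size known in advance: one allocation up front
        flat.push_back({1, "one"});
        flat.push_back({2, "two"});
        flat.push_back({3, "three"}); // pushed in key order, so already sorted

        // Lookup with std::lower_bound, comparing on the key only.
        auto it = std::lower_bound(flat.begin(), flat.end(), 2,
            [](const std::pair<int, std::string>& p, int k) { return p.first < k; });
        if (it != flat.end() && it->first == 2)
            std::cout << it->second << "\n"; // prints "two"
    }
    ```

    The trade-off is the usual one: contiguous storage and cache-friendly lookups in exchange for O(n) insertion into the middle, so it suits bulk-build-then-query workloads best.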

    Conclusion

    1. Let the std::map worry about the issue.
    2. When performance is the main issue: use a data structure (perhaps 3rd party) that best suits how your data is being used (random inserts or bulk inserts / mostly iteration or mostly lookups / etc.). You then need to profile and gather performance metrics to compare.