What data structure is inside std::map in C++?

前端 未结 6 898
余生分开走
余生分开走 2020-12-03 11:34

I\'m beginner and learning C++ Having hard times to understand std::map concepts, because the code I\'m playing with implies that the map is a search tree, i.e.

相关标签:
6条回答
  • 2020-12-03 11:40

    Map internally uses self-balancing BST . Please have a look on this link.self-balancing binary search trees

    0 讨论(0)
  • 2020-12-03 11:51

    Step debug into g++ 6.4 stdlibc++ source

    Did you know that on Ubuntu's 16.04 default g++-6 package or a GCC 6.4 build from source you can step into the C++ library without any further setup?

    By doing that we easily conclude that a Red-black tree used in this implementation.

    This makes sense, since std::map, unlike std::unordered_map, can be traversed in key order, which would not be efficient in if a hash map were used.

    main.cpp

    #include <cassert>
    #include <map>
    
    int main() {
        std::map<int, int> m;
        m.emplace(1, -1);
        m.emplace(2, -2);
        assert(m[1] == -1);
        assert(m[2] == -2);
    }
    

    Compile and debug:

    g++ -g -std=c++11 -O0 -o main.out main.cpp
    gdb -ex 'start' -q --args main.out
    

    Now, if you step into s.emplace(1, -1) you immediately reach /usr/include/c++/6/bits/stl_map.h:

    556       template<typename... _Args>
    557     std::pair<iterator, bool>
    558     emplace(_Args&&... __args)
    559     { return _M_t._M_emplace_unique(std::forward<_Args>(__args)...); }
    

    which clearly just forwards to _M_t._M_emplace_unique.

    So we open the source file in vim and find the definition of _M_t:

        typedef _Rb_tree<key_type, value_type, _Select1st<value_type>,
                key_compare, _Pair_alloc_type> _Rep_type;
    
        /// The actual tree structure.
        _Rep_type _M_t;
    

    So _M_t is of type _Rep_type and _Rep_type is a _Rb_tree.

    OK, now that is enough evidence for me. If you don't believe that _Rb_tree is a Black-red tree, step a bit further and read the algorithm

    unordered_map uses hash table

    Same procedure, but replace map with unordered_map on the code.

    This makes sense, since std::unordered_map cannot be traversed in order, so the standard library chose hash map instead of Red-black tree, since hash map has a better amortized insert time complexity.

    Stepping into emplace leads to /usr/include/c++/6/bits/unordered_map.h:

    377       template<typename... _Args>
    378     std::pair<iterator, bool>
    379     emplace(_Args&&... __args)
    380     { return _M_h.emplace(std::forward<_Args>(__args)...); }
    

    So we open the source file in vim and search for the definition of _M_h:

          typedef __umap_hashtable<_Key, _Tp, _Hash, _Pred, _Alloc>  _Hashtable;
          _Hashtable _M_h;
    

    So hash table it is.

    std::set and std::unordered_set

    Analogous to std::map vs std::unordered_map: What is the underlying data structure of a STL set in C++?

    Performance characteristics

    You could also infer the data structure used by timing them:

    Graph generation procedure and Heap vs BST analysis and at: Heap vs Binary Search Tree (BST)

    Since std::map is analogous to std::set we clearly see for:

    • std::map, a logarithmic insertion time
    • std::unordered_map, a more complex hashmap pattern:

      • on the non-zoomed plot, we clearly see the backing dynamic array doubling on huge one off linearly increasing spikes
      • on the zoomed plot, we see that the times are basically constant and going towards 250ns, therefore much faster than the std::map, except for very small map sizes

        Several strips are clearly visible, and their inclination becomes smaller whenever the array doubles.

        I believe this is due to average linearly increasing linked list walks withing each bin. Then when the array doubles, we have more bins, so shorter walks.

    0 讨论(0)
  • 2020-12-03 11:52

    std::map is an associative container. The only requirement by the standard is that the container must have an associative container interface and behavior, the implementation is not defined. While the implementation fits the complexity and interface requirements, is a valid implementation.

    On the other hand, std::map is usually implemented with a red-black tree, as the reference says.

    0 讨论(0)
  • 2020-12-03 11:56

    Viewed externally a map is just an associative container: it behave externally as an "array" (supports an a[x] expression) where x can be whatever type (not necessarily integer) is "comparable by <" (hence ordered).

    But:

    • Because x can be any value, it cannot be a plain array (otherwise it must support whatever index value: if you assign a[1] and a[100] you need also the 2..99 elements in the middle)
    • Because it has to to be fast in insert and find at whatever position, it cannot be a "linear" structure (otherwise elements shold be shifted, and finding must be sequential, and the requirements are "less then proportional finding time".

    The most common implementation uses internally a self-balancing tree (each node is a key/value pair, and are linked togheter so that the left side has lower keys, and the right side has higer keys, so that seraching is re-conducted to a binary search), a multi-skip-list (fastest than tree in retrieval, slower in insert) or a hash-based table (where each x value is re-conducted to an index of an array)

    0 讨论(0)
  • 2020-12-03 11:59

    As chris have written, the standard doesn't define the internal structure of the std::map or std::set. It defines the interface and complexity requirements for operations like insertion of an element. Those data structures of course may be implemented as trees. For example the implementation shipped with VisualStudio is based on a red-black tree.

    0 讨论(0)
  • 2020-12-03 11:59

    I would say that if you think of a map as a pair you can't go wrong. Map can be implemented as a tree or a hash map, but the way it is implemented is not as important since you can be sure any implementation is STL is an efficient one.

    0 讨论(0)
提交回复
热议问题