Serialize boost::bimap with boost::dynamic_bitset as key value pair

臣服心动 2021-01-21 14:48

I am interested in serializing a boost::bimap containing boost::dynamic_bitset so that I can save it and load it back when needed. I have made an attempt

1 answer
  • 2021-01-21 15:25
    1. Wow. You're not aiming for performance with that hash function.

      • you're copying all the blocks on every key/value hash (e.g. on lookup, on insert)
      • you better never wish to use co-routines because that thread-local static will make your life miserable

      See my BONUS section below

    2. Why this awkward dance (simplified code):

      auto iter = index.begin();
      
      // first left element of bimap
      BS first_left = iter->left;
      Index::left_iterator left_iter = index.left.find(first_left);
      

      What is wrong with

      auto left_iter = index.left.begin();
      
    3. What do you think is the validity of an iterator when serialized? (See Iterator invalidation rules)

      oa << left_iter;
      

      I think loading a new data structure from storage counts as "reallocation". Iterators or references into another data structure are obviously meaningless here.

    4. Erm. Now it's really getting confusing.

      //  first right element of bimap
      auto pos = index.left.find(first_left);
      Index::right_iterator right_iter = index.right.find(pos->second);
      

      You call it the "first right element", but you do something ELSE: you find the iterator corresponding to the first_left key (which may well be the last element on the right). Also note that since the right-hand side of the bimap is multiset_of, there might be multiple matches, and you arbitrarily use the first.

      (Side note: pos is a useless duplication of left_iter's value)

    5. See 3.

      oa << right_iter;
      
    6. Varia:

      • make sure you open the files as binary

        std::ofstream ofs("binaryfile", std::ios::binary);
        std::ifstream ifs("binaryfile", std::ios::binary);
        
      • why do you name a container with value-semantics index_reference? That's just unnecessarily confusing

      • SerializableType is unused
      • BOOST_SERIALIZATION_NVP is meaningless for binary archives (nodes have no names in those)

    The Real Question

    I suppose the real question might have been "how do I serialize the Bitsets?". I'm happy to inform you I wrote the required bits in 2015: How to serialize boost::dynamic_bitset? and the pull request has been accepted into Boost starting with version 1.64.

    So, you can sit back, sip your tea and include:

    #include <boost/dynamic_bitset/serialization.hpp>
    

    All done.

    The BONUS Section

    Since that serialization achieves a minimal-copy serialization, why not use it to power the hash function? The serialization mechanism will provide you the required private access.

    I've abused serialization plumbing for hash<> specializations before: Hash an arbitrary precision value (boost::multiprecision::cpp_int)

    Putting It All Together

    Live On Coliru

    #include <boost/archive/binary_iarchive.hpp>
    #include <boost/archive/binary_oarchive.hpp>
    #include <boost/bimap.hpp>
    #include <boost/bimap/unordered_multiset_of.hpp>
    #include <boost/bimap/unordered_set_of.hpp>
    #include <boost/dynamic_bitset/serialization.hpp>
    #include <fstream>
    #include <iostream>
    #include <string>
    
    #include <boost/iostreams/device/back_inserter.hpp>
    #include <boost/iostreams/stream_buffer.hpp>
    #include <boost/iostreams/stream.hpp>
    
    #include <boost/functional/hash.hpp>
    
    namespace serial_hashing { // see https://stackoverflow.com/questions/30097385/hash-an-arbitrary-precision-value-boostmultiprecisioncpp-int
        namespace io = boost::iostreams;
    
        struct hash_sink {
            hash_sink(size_t& seed_ref) : _ptr(&seed_ref) {}
    
            typedef char         char_type;
            typedef io::sink_tag category;
    
            std::streamsize write(const char* s, std::streamsize n) {
                boost::hash_combine(*_ptr, boost::hash_range(s, s+n));
                return n;
            }
          private:
            size_t* _ptr;
        };
    
        template <typename T> struct hash_impl {
            size_t operator()(T const& v) const {
                using namespace boost;
                size_t seed = 0;
                {
                    iostreams::stream<hash_sink> os(seed);
                    archive::binary_oarchive oa(os, archive::no_header | archive::no_codecvt);
                    oa << v;
                }
                return seed;
            }
        };
    }
    
    namespace std {
        template <typename Block, typename Alloc> struct hash<boost::dynamic_bitset<Block, Alloc> >
            : serial_hashing::hash_impl<boost::dynamic_bitset<Block, Alloc> > 
        {};
    } // namespace std
    
    namespace bimaps = boost::bimaps;
    using Bitset = boost::dynamic_bitset<>;
    
    typedef boost::bimap<
        bimaps::unordered_set_of<Bitset, std::hash<Bitset> >,
         bimaps::unordered_multiset_of<Bitset, std::hash<Bitset> > > Index;
    
    int main() {
        using namespace std::string_literals;
    
        {
            std::cout << "# Writing binary file ... " << std::endl;
            Index index;
            index.insert({Bitset("10010"s), Bitset("1010110110101010101"s)});
    
            std::ofstream ofs("binaryfile", std::ios::binary);
            boost::archive::binary_oarchive oa(ofs);
            oa << index;
        }
    
        {
            std::cout << "# Loading binary file ... " << std::endl;
            std::ifstream ifs("binaryfile", std::ios::binary); // name of loading file
    
            boost::archive::binary_iarchive ia(ifs);
    
            Index index;
            ia >> index;
        }
    }
    

    Prints

    # Writing binary file ... 
    # Loading binary file ... 
    

    No problem.

    POST SCRIPTUM

    Really, save yourself the trouble. Since your usage clearly indicates you do not want unordered semantics, just make it ordered:

    Live On Coliru

    #include <boost/archive/binary_iarchive.hpp>
    #include <boost/archive/binary_oarchive.hpp>
    #include <boost/bimap.hpp>
    #include <boost/bimap/multiset_of.hpp>
    #include <boost/dynamic_bitset/serialization.hpp>
    #include <fstream>
    #include <iostream>
    
    namespace bimaps = boost::bimaps;
    using Bitset = boost::dynamic_bitset<>;
    
    typedef boost::bimap<bimaps::set_of<Bitset>, bimaps::multiset_of<Bitset>> Index;
    
    int main() {
        using namespace std::string_literals;
    
        {
            std::cout << "# Writing binary file ... " << std::endl;
            Index index;
            index.insert({Bitset("10010"s), Bitset("1010110110101010101"s)});
    
            std::ofstream ofs("binaryfile", std::ios::binary);
            boost::archive::binary_oarchive oa(ofs);
            oa << index;
        }
    
        {
            std::cout << "# Loading binary file ... " << std::endl;
            std::ifstream ifs("binaryfile", std::ios::binary); // name of loading file
    
            boost::archive::binary_iarchive ia(ifs);
    
            Index index;
            ia >> index;
        }
    }
    

    Down to 36 lines, less than half the code left.
