I am interested to serialize a boost::bimap containing boost::dynamic_bitset so that I can save that and load back when needed. I have made an attempt
Wow. You're not aiming for performance with that hash function.
See my BONUS section below
Why this awkward dance (simplified code):
auto iter = index.begin();
// first left element of bimap
BS first_left = iter->left;
Index::left_iterator left_iter = index.left.find(first_left);
What is wrong with
auto left_iter = index.left.begin();
What do you think is the validity of an iterator when serialized? (See Iterator invalidation rules)
oa << left_iter;
I think loading a new data structure from storage counts as "reallocation". Iterators or references into another data structure are obviously meaningless here.
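To make that concrete, here is a minimal sketch of the usual alternative: store the key of the element you care about and look it up again after loading. It deliberately uses a plain std::map and a text stream as stand-ins for the bimap and the Boost archive, and the names Table, save and load are hypothetical, not taken from the question. It also assumes keys and values contain no whitespace.

```cpp
#include <cstddef>
#include <istream>
#include <map>
#include <ostream>
#include <sstream>
#include <string>

// Stand-in for the bimap: the point is WHAT gets stored, not the container type.
using Table = std::map<std::string, std::string>;

// Save the contents plus the KEY of the element of interest (never an iterator).
void save(std::ostream& os, Table const& table, std::string const& remembered_key) {
    os << table.size() << '\n';
    for (auto const& entry : table)
        os << entry.first << ' ' << entry.second << '\n';
    os << remembered_key << '\n';
}

// Rebuild the container, then re-find the remembered key in the NEW container.
Table::const_iterator load(std::istream& is, Table& table) {
    table.clear();
    std::size_t count = 0;
    is >> count;
    for (std::size_t i = 0; i < count; ++i) {
        std::string key, value;
        is >> key >> value;
        table.emplace(key, value);
    }
    std::string remembered_key;
    is >> remembered_key;
    return table.find(remembered_key); // valid iterator into the loaded table
}
```

With the Boost archives from the question, the same idea would presumably amount to writing the container followed by the key, and calling index.left.find(key) after reading both back.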
Erm. Now it's really getting confusing.
// first right element of bimap
auto pos = index.left.find(first_left);
Index::right_iterator right_iter = index.right.find(pos->second);
You call it the "first right element", but you do something ELSE: you find the iterator corresponding to the first_left key (which may well be the last element on the right). Also note that since the right-hand side of the bimap is multiset_of, there might be multiple matches, and you arbitrarily use the first.
(Side note: pos is a useless duplication of left_iter's value.)
Same problem as with left_iter above: serializing an iterator is meaningless.
oa << right_iter;
Varia:
make sure you open the files as binary
std::ofstream ofs("binaryfile", std::ios::binary);
std::ifstream ifs("binaryfile", std::ios::binary);
why do you name a container with value-semantics index_reference? That's just unnecessarily confusing
SerializableType is unused
BOOST_SERIALIZATION_NVP is meaningless for binary archives (nodes have no names in those)
I suppose the real question might have been "how do I serialize the Bitsets?". I'm happy to inform you I wrote the required bits in 2015 (see: How to serialize boost::dynamic_bitset?), and the pull request has been accepted into Boost starting with version 1.64.
So, you can sit back, sip your tea and include:
#include <boost/dynamic_bitset/serialization.hpp>
All done.
BONUS: Since that serialization is minimal-copy, why not use it to power the hash function? The serialization mechanism will provide you the required private access.
I've abused serialization plumbing for hash<> specializations before: Hash an arbitrary precision value (boost::multiprecision::cpp_int)
#include <boost/archive/binary_iarchive.hpp>
#include <boost/archive/binary_oarchive.hpp>
#include <boost/bimap.hpp>
#include <boost/bimap/unordered_multiset_of.hpp>
#include <boost/bimap/unordered_set_of.hpp>
#include <boost/dynamic_bitset/serialization.hpp>
#include <fstream>
#include <iostream>
#include <string>

#include <boost/iostreams/device/back_inserter.hpp>
#include <boost/iostreams/stream_buffer.hpp>
#include <boost/iostreams/stream.hpp>

#include <boost/functional/hash.hpp>

namespace serial_hashing { // see https://stackoverflow.com/questions/30097385/hash-an-arbitrary-precision-value-boostmultiprecisioncpp-int
    namespace io = boost::iostreams;

    struct hash_sink {
        hash_sink(size_t& seed_ref) : _ptr(&seed_ref) {}

        typedef char char_type;
        typedef io::sink_tag category;

        std::streamsize write(const char* s, std::streamsize n) {
            boost::hash_combine(*_ptr, boost::hash_range(s, s+n));
            return n;
        }

      private:
        size_t* _ptr;
    };

    template <typename T> struct hash_impl {
        size_t operator()(T const& v) const {
            using namespace boost;
            size_t seed = 0;
            {
                iostreams::stream<hash_sink> os(seed);
                archive::binary_oarchive oa(os, archive::no_header | archive::no_codecvt);
                oa << v;
            }
            return seed;
        }
    };
}

namespace std {
    template <typename Block, typename Alloc> struct hash<boost::dynamic_bitset<Block, Alloc> >
        : serial_hashing::hash_impl<boost::dynamic_bitset<Block, Alloc> >
    {};
} // namespace std

namespace bimaps = boost::bimaps;
using Bitset = boost::dynamic_bitset<>;

typedef boost::bimap<
    bimaps::unordered_set_of<Bitset, std::hash<Bitset> >,
    bimaps::unordered_multiset_of<Bitset, std::hash<Bitset> > > Index;

int main() {
    using namespace std::string_literals;

    {
        std::cout << "# Writing binary file ... " << std::endl;
        Index index;
        index.insert({Bitset("10010"s), Bitset("1010110110101010101"s)});

        std::ofstream ofs("binaryfile", std::ios::binary);
        boost::archive::binary_oarchive oa(ofs);
        oa << index;
    }

    {
        std::cout << "# Loading binary file ... " << std::endl;
        std::ifstream ifs("binaryfile", std::ios::binary); // name of loading file
        boost::archive::binary_iarchive ia(ifs);

        Index index;
        ia >> index;
    }
}
Prints
# Writing binary file ...
# Loading binary file ...
No problem.
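If the "write into a sink that hashes instead of stores" trick looks opaque, here is a rough standard-library-only sketch of the same idea. It is not what the code above does byte for byte: it hashes the text that operator<< produces rather than a Boost binary archive, it uses FNV-1a instead of boost::hash_combine, and the names hashing_streambuf and stream_hash are made up for this illustration.

```cpp
#include <cstdint>
#include <ostream>
#include <streambuf>
#include <string>

// Stream buffer that folds every byte written to it into a 64-bit FNV-1a
// hash instead of storing the byte anywhere.
class hashing_streambuf : public std::streambuf {
  public:
    std::uint64_t digest() const { return hash_; }

  protected:
    // Single-character writes end up here (there is no put area).
    int_type overflow(int_type ch) override {
        if (!traits_type::eq_int_type(ch, traits_type::eof()))
            mix(static_cast<unsigned char>(traits_type::to_char_type(ch)));
        return traits_type::not_eof(ch);
    }

    // Bulk writes end up here.
    std::streamsize xsputn(const char* s, std::streamsize n) override {
        for (std::streamsize i = 0; i < n; ++i)
            mix(static_cast<unsigned char>(s[i]));
        return n;
    }

  private:
    void mix(unsigned char byte) {
        hash_ ^= byte;
        hash_ *= 1099511628211ULL; // FNV-1a 64-bit prime
    }

    std::uint64_t hash_ = 14695981039346656037ULL; // FNV-1a 64-bit offset basis
};

// Hash anything that can be streamed to a std::ostream.
template <typename T>
std::uint64_t stream_hash(T const& value) {
    hashing_streambuf buf;
    std::ostream os(&buf);
    os << value;
    os.flush();
    return buf.digest();
}
```

Two values that stream to the same bytes get the same hash, so stream_hash(42) and stream_hash(std::string("42")) agree. That is the same property the archive-based version relies on: equal serialized representations give equal hashes.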
Really, save yourself trouble. Since your usage clearly indicates you do not want unordered semantics, just make it ordered:
#include <boost/archive/binary_iarchive.hpp>
#include <boost/archive/binary_oarchive.hpp>
#include <boost/bimap.hpp>
#include <boost/bimap/multiset_of.hpp>
#include <boost/dynamic_bitset/serialization.hpp>
#include <fstream>
#include <iostream>

namespace bimaps = boost::bimaps;
using Bitset = boost::dynamic_bitset<>;

typedef boost::bimap<bimaps::set_of<Bitset>, bimaps::multiset_of<Bitset>> Index;

int main() {
    using namespace std::string_literals;

    {
        std::cout << "# Writing binary file ... " << std::endl;
        Index index;
        index.insert({Bitset("10010"s), Bitset("1010110110101010101"s)});

        std::ofstream ofs("binaryfile", std::ios::binary);
        boost::archive::binary_oarchive oa(ofs);
        oa << index;
    }

    {
        std::cout << "# Loading binary file ... " << std::endl;
        std::ifstream ifs("binaryfile", std::ios::binary); // name of loading file
        boost::archive::binary_iarchive ia(ifs);

        Index index;
        ia >> index;
    }
}
Down to 36 lines, less than half the code left.