Serializing function objects

纵然是瞬间 submitted on 2019-11-29 05:50:03

No.

C++ has no built-in support for serialization and was never conceived with the idea of transmitting code from one process to another, let alone from one machine to another. Languages that can do so generally feature both an IR (an intermediate, machine-independent representation of the code) and reflection.

So you are left with writing your own protocol for transmitting the actions you want. The DSL approach is certainly workable, depending on the variety of tasks you wish to perform and the need for performance.
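To make the protocol idea concrete, here is a minimal sketch of a two-operation text DSL. Everything in it (the Action struct, the "add"/"mul" operations, the function names) is a hypothetical illustration, not something from the question: the sender transmits a description of the action, and the receiver maps that description back onto code it already contains.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical two-operation DSL: "add <n>" or "mul <n>".
struct Action {
  std::string op;
  int operand;
};

// Sender side: encode the action as one line of text.
std::string encode(const Action& a) {
  return a.op + " " + std::to_string(a.operand);
}

// Receiver side: parse the line and apply the action to x.
int apply_encoded(const std::string& line, int x) {
  std::istringstream in(line);
  Action a;
  if (!(in >> a.op >> a.operand)) throw std::invalid_argument("malformed action");
  if (a.op == "add") return x + a.operand;
  if (a.op == "mul") return x * a.operand;
  throw std::invalid_argument("unknown op: " + a.op);
}
```

No code crosses the wire here, only data that selects among operations both sides already know.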

Another solution would be to go with an existing language. For example, the Redis NoSQL database embeds a Lua engine and can execute Lua scripts; you could do the same and transmit Lua scripts over the network.

Yes for function pointers and closures. Not for std::function.

A function pointer is the simplest — it is just a pointer like any other so you can just read it as bytes:

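#include <cstring>
#include <string>

// Copy the raw bytes of the function pointer into a string.
template <typename Res, typename... Args>
std::string serialize(Res (*fn_ptr)(Args...)) {
  return std::string(reinterpret_cast<const char*>(&fn_ptr), sizeof(fn_ptr));
}

// Rebuild the pointer from those bytes; Res and Args must be given explicitly.
template <typename Res, typename... Args>
Res (*deserialize(const std::string& str))(Args...) {
  Res (*fn_ptr)(Args...) = nullptr;
  std::memcpy(&fn_ptr, str.data(), sizeof(fn_ptr));  // avoids a misaligned read
  return fn_ptr;
}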

But I was surprised to find that, even without recompilation, the address of a function changes on every run of the program. That is not very useful if you want to transmit the address. The cause is ASLR (address space layout randomization), which you can turn off on Linux by starting your_program with setarch $(uname -m) -LR your_program.

Now you can send the function pointer to a different machine running the same binary, and call it! (This does not involve transmitting executable code. But unless you are generating executable code at run-time, I don't think you are looking for that.)
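As a hedged illustration, the round trip can be tried inside a single process, where the address is trivially stable. The names twice, to_bytes, from_bytes and round_trip are made up for this sketch. Across processes or machines the same bytes are only meaningful if both sides run the identical binary loaded at the same address, which is the ASLR caveat above.

```cpp
#include <cassert>
#include <cstring>
#include <string>

int twice(int x) { return 2 * x; }  // hypothetical function to "transmit"

using IntFn = int (*)(int);

// Sender: raw bytes of the pointer value.
std::string to_bytes(IntFn fn) {
  return std::string(reinterpret_cast<const char*>(&fn), sizeof(fn));
}

// Receiver: rebuild the pointer, or nullptr if the payload has the wrong size.
IntFn from_bytes(const std::string& bytes) {
  IntFn fn = nullptr;
  if (bytes.size() == sizeof(fn)) std::memcpy(&fn, bytes.data(), sizeof(fn));
  return fn;
}

// In-process round trip; returns -1 if the payload was rejected.
int round_trip(int x) {
  std::string wire = to_bytes(&twice);
  IntFn fn = from_bytes(wire);
  return fn ? fn(x) : -1;
}
```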

A lambda function is quite different.

std::function<int(int)> addN(int N) {
  auto f = [=](int x){ return x + N; };
  return f;
}

The value of f will be the captured int N. Its representation in memory is the same as an int! The compiler generates an unnamed class for the lambda, of which f is an instance. This class has operator() overloaded with our code.
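A small check of that claim, written as a sketch: the function name is hypothetical, and the layout of a closure is not guaranteed by the standard, so this only reports what a given compiler does (mainstream compilers store just the captured int).

```cpp
#include <cassert>
#include <cstring>

// Returns true if the closure object consists of exactly the captured int.
bool closure_is_just_the_int(int N) {
  auto f = [=](int x) { return x + N; };
  if (sizeof(f) != sizeof(int)) return false;  // some other layout
  int stored = 0;
  std::memcpy(&stored, &f, sizeof(int));       // read the closure's bytes
  return stored == N;
}
```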

The class being unnamed presents a problem for serialization. It also presents a problem for returning lambda functions from functions. The latter problem is solved by std::function.

std::function, as far as I understand, is implemented by creating a templated wrapper class which effectively holds a reference to the unnamed class behind the lambda function through the template type parameter. (In libstdc++ this is _Function_handler in functional.) std::function takes a function pointer to a static method (_M_invoke) of this wrapper class and stores that plus the closure value.

Unfortunately, everything is buried in private members and the size of the closure value is not stored. (It does not need to be, because the lambda function knows its own size.)

So std::function does not lend itself to serialization, but works well as a blueprint. I followed what it does, simplified it a lot (I only wanted to serialize lambdas, not the myriad other callable things), saved the size of the closure value in a size_t, and added methods for (de)serialization. It works!
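The answer does not show its code, so the following is only a sketch of what such a wrapper might look like, not the author's implementation. SerializableFn, call and round_trip_demo are hypothetical names. It assumes the closure is trivially copyable (for example, it captures only ints by value), and the serialized invoker address is valid only in the same binary loaded at the same address. Rebuilding the closure from raw bytes relies on implicit object creation, which is only formally sanctioned from C++20 on.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <type_traits>

template <typename Sig> class SerializableFn;  // hypothetical name

template <typename Res, typename... Args>
class SerializableFn<Res(Args...)> {
  using Invoker = Res (*)(const char*, Args...);
  Invoker invoke_ = nullptr;  // knows the closure type, in the spirit of _M_invoke
  std::string closure_;       // raw closure bytes; size() is the saved size

  // One instantiation per closure type F: rebuild an F from bytes and call it.
  template <typename F>
  static Res call(const char* bytes, Args... args) {
    alignas(F) unsigned char buf[sizeof(F)];
    std::memcpy(buf, bytes, sizeof(F));
    return (*reinterpret_cast<F*>(buf))(args...);
  }

 public:
  SerializableFn() = default;

  template <typename F>
  explicit SerializableFn(const F& f)
      : invoke_(&call<F>),
        closure_(reinterpret_cast<const char*>(&f), sizeof(F)) {
    static_assert(std::is_trivially_copyable<F>::value,
                  "captures must be trivially copyable");
  }

  Res operator()(Args... args) const { return invoke_(closure_.data(), args...); }

  // Wire format: [invoker address][closure size][closure bytes].
  std::string serialize() const {
    std::size_t n = closure_.size();
    std::string out(reinterpret_cast<const char*>(&invoke_), sizeof(invoke_));
    out.append(reinterpret_cast<const char*>(&n), sizeof(n));
    out.append(closure_);
    return out;
  }

  static SerializableFn deserialize(const std::string& in) {
    SerializableFn fn;
    std::size_t n = 0;
    std::memcpy(&fn.invoke_, in.data(), sizeof(fn.invoke_));
    std::memcpy(&n, in.data() + sizeof(fn.invoke_), sizeof(n));
    fn.closure_.assign(in, sizeof(fn.invoke_) + sizeof(n), n);
    return fn;
  }
};

// In-process round trip of a lambda that captures N by value.
int round_trip_demo() {
  int N = 3;
  SerializableFn<int(int)> f([=](int x) { return x + N; });
  auto g = SerializableFn<int(int)>::deserialize(f.serialize());
  return g(4);
}
```

The sketch does no validation of the input bytes; a real protocol would at least check lengths before copying.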

No, but there are some restricted solutions.

The most you can hope for is to register functions in some sort of global map (e.g. with string keys) that is common to the sending code and the receiving code (either on different computers or before and after serialization). You can then serialize the string associated with the function and look it up on the other side.
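For free functions, the registry idea can be sketched as follows. The function names, the keys and the registry() helper are illustrative assumptions; the only real requirement is that sender and receiver are built from the same registration code.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

using Fn = double (*)(double);

double identity(double x) { return x; }
double twice(double x) { return 2. * x; }

// Hypothetical registry, compiled identically into sender and receiver.
const std::map<std::string, Fn>& registry() {
  static const std::map<std::string, Fn> r{{"identity", &identity},
                                           {"twice", &twice}};
  return r;
}

// Sender: find the key under which fn was registered.
std::string serialize(Fn fn) {
  for (const auto& kv : registry())
    if (kv.second == fn) return kv.first;
  throw std::invalid_argument("function is not registered");
}

// Receiver: map the key back to a function; throws std::out_of_range if unknown.
Fn deserialize(const std::string& key) {
  return registry().at(key);
}
```

Only the key travels, so the scheme survives ASLR and even recompilation, as long as the key still names the same function on both sides.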

As a concrete example the library HPX implements something like this, in something called HPX_ACTION.

This requires a lot of protocol and it is fragile with respect to changes in code.

But after all, this is no different from trying to serialize a class with private data. In some sense the code of the function is its private part (the arguments and the return type are the public part).

What leaves a sliver of hope is that, depending on how you organize the code, these "objects" can be global or common, and if all goes right they are available during serialization and deserialization through some kind of predefined runtime indirection.

This is a crude example:

serializer code:

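#include <fstream>
#include <iostream>
#include <map>
#include <string>

// common:
class C{
  double d;
  public:
  C(double d) : d(d){}
  double operator()(double x) const{return d*x;}
};
C c1{1.};
C c2{2.};
std::map<std::string, C*> const m{{"c1", &c1}, {"c2", &c2}};
// :common

// write the map key of f ("c1" or "c2") to a file; the name f.txt is arbitrary
void serialize(C const* f){
   std::ofstream out("f.txt");
   for(auto const& kv : m) if(kv.second == f) out << kv.first;
}

int main(int argc, char** argv){
   C* f = (argc == 2)?&c1:&c2;
   std::cout << (*f)(5.) << '\n'; // prints 5 or 10 depending on the runtime args
   serialize(f); // writes "c1" or "c2" to the file
}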

deserializer code:

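#include <fstream>
#include <iostream>
#include <map>
#include <string>

// common: (must match the serializer, including the values of c1 and c2)
class C{
  double d;
  public:
  C(double d) : d(d){}
  double operator()(double x) const{return d*x;}
};
C c1{1.};
C c2{2.};
std::map<std::string, C*> const m{{"c1", &c1}, {"c2", &c2}};
// :common

// read "c1" or "c2" from the file (f.txt, an arbitrary name) and look it up in the map
void deserialize(C*& f){
   std::ifstream in("f.txt");
   std::string key;
   in >> key;
   f = m.at(key); // throws std::out_of_range for an unknown key
}

int main(){
   C* f;
   deserialize(f);
   std::cout << (*f)(3.) << '\n'; // prints 3 or 6 depending on the args of the **other** run
}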

(code not tested).

Note that this forces a lot of common and consistent code, but depending on the environment you might be able to guarantee this. The slightest change in the code can produce a hard to detect logical bug.

Also, I played here with global objects (the same approach can be used with free functions), but the same can be done with scoped objects. What becomes trickier is how to establish the map locally (#include the common code inside a local scope?).
