Fastest C++ serialization?

花落未央 2021-02-18 18:08

Good morning all,

I'm searching for a very fast binary serialization technique for C++. I only need to serialize data contained in objects (no pointers etc.). I'd like …

11 Answers
  • 2021-02-18 18:35

    The C++ Middleware Writer is an online alternative to serialization libraries. In some cases it is faster than Boost.Serialization.

  • 2021-02-18 18:36

    Both your C and your C++ code will probably be dominated (in time) by file I/O. I would recommend using memory-mapped files when writing your data and leaving the I/O buffering to the operating system. Boost.Interprocess could be an alternative.
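
    A minimal POSIX sketch of that approach (plain mmap rather than Boost.Interprocess; the Record type, file name, and iteration count are stand-ins, not from the question):

        // Write fixed-size records through a memory-mapped file and let the
        // OS page cache handle buffering. Error handling omitted for brevity.
        #include <cstdint>
        #include <fcntl.h>
        #include <sys/mman.h>
        #include <unistd.h>

        struct Record { uint64_t off; uint32_t size; };  // stand-in payload

        int main() {
            const long tests = 10000000;  // assumed iteration count
            const size_t bytes = tests * sizeof(Record);

            int fd = open("test.mmap.dat", O_RDWR | O_CREAT | O_TRUNC, 0644);
            ftruncate(fd, static_cast<off_t>(bytes));  // size the file up front
            void* base = mmap(nullptr, bytes, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

            Record* out = static_cast<Record*>(base);
            for (long i = 0; i < tests; i++) {
                // Plain stores into the mapping replace per-record write calls.
                out[i].off = static_cast<uint64_t>(i);
                out[i].size = static_cast<uint32_t>(i & 0xFFFF);
            }

            munmap(base, bytes);  // the OS writes dirty pages back
            close(fd);
            return 0;
        }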

  • 2021-02-18 18:38

    Because I/O is most likely the bottleneck, a compact format may help. Out of curiosity I tried the following Colfer schema, compiled with colf -s 16 C.

        package data
    
        type item struct {
                off  uint64
                size uint32
        }
    

    ... with a comparable C test:

        #include <stdio.h>
        #include <stdlib.h>
        #include <time.h>
        #include "Colfer.h"  /* header generated by colf (name per its C output) */

        int main( void )
        {
           const long tests = 10000000;  /* iteration count; the original value isn't shown */
           clock_t start = clock();

           data_item data;
           void* buf = malloc( colfer_size_max );  /* upper bound for one marshalled item */

           FILE* fd = fopen( "test.colfer.dat", "wb" );
           for ( long i = 0; i < tests; i++ )
           {
              data.off = i;
              data.size = i & 0xFFFF;
              size_t n = data_item_marshal( &data, buf );  /* serialize one item into buf */
              fwrite( buf, n, 1, fd );
           }
           fclose( fd );
           free( buf );

           clock_t stop = clock();
           printf( "colfer took %7.3f seconds\n", (double)(stop - start) / CLOCKS_PER_SEC );
           return 0;
        }
    

    The results are quite disappointing on an SSD, despite the serialized size being 40% smaller than the raw struct dumps.

        colfer took   0.520 seconds
        plain took    0.320 seconds
    

    Since the generated code is pretty fast, it seems unlikely you'll gain anything with serialization libraries.
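
    For reference, the "plain" baseline is presumably a raw struct dump along these lines (my reconstruction, not the answer's actual code):

        // Presumed "plain" variant: dump raw struct bytes with fwrite,
        // mirroring the loop in the Colfer test above.
        #include <cstdint>
        #include <cstdio>

        struct Item { uint64_t off; uint32_t size; };

        int main() {
            const long tests = 10000000;  // assumed iteration count
            Item data;
            FILE* fd = fopen("test.plain.dat", "wb");
            for (long i = 0; i < tests; i++) {
                data.off = static_cast<uint64_t>(i);
                data.size = static_cast<uint32_t>(i & 0xFFFF);
                fwrite(&data, sizeof data, 1, fd);  // writes padding bytes too
            }
            fclose(fd);
            return 0;
        }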

  • 2021-02-18 18:40

    Google FlatBuffers: similar to Protocol Buffers, but much faster, since the serialized buffer can be read in place without a parsing or unpacking step.

    https://google.github.io/flatbuffers/

    https://google.github.io/flatbuffers/md__benchmarks.html
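
    A minimal sketch of the API, assuming a schema that mirrors the struct from the question (schema, names, and values here are illustrative):

        // item.fbs -- compile with: flatc --cpp item.fbs
        table Item {
            off:  ulong;
            size: uint;
        }
        root_type Item;

    ... and the generated C++ side:

        #include "flatbuffers/flatbuffers.h"
        #include "item_generated.h"  // produced by flatc from the schema above

        int main() {
            flatbuffers::FlatBufferBuilder builder;

            // Serialize: the generated CreateItem writes straight into the builder.
            builder.Finish(CreateItem(builder, /*off=*/42, /*size=*/7));

            // The wire bytes are builder.GetBufferPointer() / builder.GetSize().
            // Reading needs no unpacking step; fields are accessed in place:
            const Item* item = GetItem(builder.GetBufferPointer());
            return item->size() == 7 ? 0 : 1;
        }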

  • 2021-02-18 18:41

    Is there any way you can take advantage of things that stay the same?

    I mean, you are just trying to run through "test.c.dat" as fast as you possibly can, right? Can you take advantage of the fact that the file does not change between your serialization attempts? If you are serializing the same file over and over again, you can optimize based on that: the first attempt takes the same time as yours, plus a tiny bit extra for a check, and every later run on the same input can then go much faster. A sketch of such a check follows.
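
    One way to implement that check (a sketch; the stamp file and the mtime-plus-size fingerprint are my assumptions, not something from the question):

        // Skip re-serialization when the input file is unchanged since the
        // last run, at the cost of one stat() and a tiny stamp file.
        #include <cstdio>
        #include <sys/stat.h>

        static bool unchanged_since_last_run(const char* input, const char* stamp) {
            struct stat st;
            if (stat(input, &st) != 0) return false;

            FILE* f = fopen(stamp, "rb");
            if (!f) return false;  // no previous run recorded
            struct stat prev;
            bool same = fread(&prev, sizeof prev, 1, f) == 1 &&
                        prev.st_mtime == st.st_mtime &&
                        prev.st_size == st.st_size;
            fclose(f);
            return same;
        }

        static void record_run(const char* input, const char* stamp) {
            struct stat st;
            if (stat(input, &st) != 0) return;
            if (FILE* f = fopen(stamp, "wb")) {
                fwrite(&st, sizeof st, 1, f);
                fclose(f);
            }
        }

    If unchanged_since_last_run("test.c.dat", "test.c.stamp") reports true, the previously serialized output can be reused instead of doing the work again.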

    I understand that this may just be a carefully crafted example, but you seem to be focused on making the language accomplish your task as quickly as possible, instead of asking the question of "do I need to accomplish this again?" What is the context of this approach?

    I hope this is helpful.

    -Brian J. Stinar-
