BinaryFormatter alternatives

旧巷少年郎 2020-12-30 04:37

A BinaryFormatter-serialized array of 128³ doubles takes up 50 MB of space. Serializing an array of 128³ structs with two double fields takes up 150 MB an

3 Answers
  • 2020-12-30 04:43

    Serializing means that metadata is added so that the data can be safely deserialized; that is what causes the overhead. If you write the data yourself without any metadata, you end up with 16 MB of data (128³ doubles × 8 bytes each):

    foreach (double d in array) {
       byte[] bin = BitConverter.GetBytes(d);
       stream.Write(bin, 0, bin.Length);
    }
    

    This of course means that you also have to deserialize the data yourself:

    using (BinaryReader reader = new BinaryReader(stream)) {
       for (int i = 0; i < array.Length; i++) {
          byte[] data = reader.ReadBytes(8);
          array[i] = BitConverter.ToDouble(data, 0);
       }
    }
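
    The two loops above can be combined into a round trip that also checks the 16 MB figure. This is only a sketch: the class name `RawDoubles` and its `Write` and `Read` helpers are made up for illustration, and it uses a `MemoryStream` instead of a file so that it runs without touching the disk.

    ```csharp
    using System;
    using System.IO;
    using System.Text;

    static class RawDoubles
    {
        // Writes each double as 8 raw bytes, with no header or type metadata.
        public static void Write(Stream stream, double[] values)
        {
            // leaveOpen: true, so the caller keeps ownership of the stream.
            using (var writer = new BinaryWriter(stream, Encoding.UTF8, true))
            {
                foreach (double d in values)
                    writer.Write(d);
            }
        }

        // Reads back exactly `count` doubles; the caller has to know the count.
        public static double[] Read(Stream stream, int count)
        {
            var values = new double[count];
            using (var reader = new BinaryReader(stream, Encoding.UTF8, true))
            {
                for (int i = 0; i < count; i++)
                    values[i] = reader.ReadDouble();
            }
            return values;
        }

        static void Main()
        {
            var data = new double[128 * 128 * 128];
            for (int i = 0; i < data.Length; i++) data[i] = i * 0.5;

            using (var ms = new MemoryStream())
            {
                Write(ms, data);
                Console.WriteLine(ms.Length);   // 128³ × 8 = 16777216 bytes
                ms.Position = 0;
                double[] copy = Read(ms, data.Length);
                Console.WriteLine(copy[3] == data[3]);
            }
        }
    }
    ```

    Because nothing but the values is stored, the reader has to get the element count from somewhere else, for example from a known grid size.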
    
  • 2020-12-30 04:44

    If you use a BinaryWriter instead of a serializer, you will get the desired (minimal) size.
    I'm not sure about the speed, but give it a try.

    On my system, writing 32 MB takes less than 0.5 seconds, including opening and closing the stream.

    You will have to write your own for loops to write the data, like this:

    struct Pair
    {
        public double X, Y;
    }
    
    static void WritePairs(string filename, Pair[] data)
    {
        using (var fs = System.IO.File.Create(filename))
        using (var bw = new System.IO.BinaryWriter(fs))
        {
            for (int i = 0; i < data.Length; i++)
            {
                bw.Write(data[i].X);
                bw.Write(data[i].Y);
            }
        }
    }
    
    static void ReadPairs(string fileName, Pair[] data)
    {
        using (var fs = System.IO.File.OpenRead(fileName))
        using (var br = new System.IO.BinaryReader(fs))
        {
            for (int i = 0; i < data.Length; i++)
            {
                data[i].X = br.ReadDouble();
                data[i].Y = br.ReadDouble();
            }
        }
    }
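
    `ReadPairs` above assumes that the caller already knows how many items the file holds. One possible variation, sketched here, stores the element count as a 4-byte header so that the reader can allocate the array itself. The `PairFile` class and its `Save` and `Load` methods are hypothetical names, not part of the answer above.

    ```csharp
    using System;
    using System.IO;

    struct Pair
    {
        public double X, Y;
    }

    static class PairFile
    {
        // Hypothetical helper: writes a 4-byte element count, then X and Y for each item.
        public static void Save(string fileName, Pair[] data)
        {
            using (var fs = File.Create(fileName))
            using (var bw = new BinaryWriter(fs))
            {
                bw.Write(data.Length);
                for (int i = 0; i < data.Length; i++)
                {
                    bw.Write(data[i].X);
                    bw.Write(data[i].Y);
                }
            }
        }

        // Hypothetical helper: reads the count first, so the array can be allocated here.
        public static Pair[] Load(string fileName)
        {
            using (var fs = File.OpenRead(fileName))
            using (var br = new BinaryReader(fs))
            {
                var data = new Pair[br.ReadInt32()];
                for (int i = 0; i < data.Length; i++)
                {
                    data[i].X = br.ReadDouble();
                    data[i].Y = br.ReadDouble();
                }
                return data;
            }
        }

        static void Main()
        {
            string path = Path.GetTempFileName();
            var pairs = new Pair[1000];
            pairs[7].X = 1.25;
            pairs[7].Y = -3.5;

            Save(path, pairs);
            Console.WriteLine(new FileInfo(path).Length);   // 4 + 1000 × 16 = 16004 bytes

            Pair[] loaded = Load(path);
            Console.WriteLine(loaded[7].X == 1.25 && loaded[7].Y == -3.5);
            File.Delete(path);
        }
    }
    ```

    The header costs 4 bytes per file, which is negligible next to the 16 bytes stored per item.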
    
  • 2020-12-30 04:57

    This is more of a comment, but it is far too long for one. I'm not able to reproduce your results. There is, however, some additional overhead with the struct.

    My testing:

    -------------------------------------------------------------------------------
    Testing array of doubles
    
    Size of double:  8
    Size of doubles.bin:  16777244
    Size per array item:  8
    Milliseconds to serialize:  143
    -------------------------------------------------------------------------------
    -------------------------------------------------------------------------------
    Testing array of structs
    
    Size of dd struct:  16
    Size of structs.bin:  52428991
    Size per array item:  25
    Milliseconds to serialize:  9678
    -------------------------------------------------------------------------------
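
    The reported file sizes can be broken down with a little integer arithmetic. The sketch below (the `OverheadCheck` class and its `Split` method are made-up names) divides each file size by the 2097152 items and keeps the remainder:

    ```csharp
    using System;

    static class OverheadCheck
    {
        // Splits a file size into whole bytes per item plus the leftover bytes.
        public static void Split(long fileSize, long items, out long perItem, out long leftover)
        {
            perItem = fileSize / items;
            leftover = fileSize % items;
        }

        static void Main()
        {
            const long items = 2097152;   // 128³ elements, as in the test above
            long perItem, leftover;

            Split(16777244, items, out perItem, out leftover);
            Console.WriteLine(perItem + " " + leftover);   // 8 28

            Split(52428991, items, out perItem, out leftover);
            Console.WriteLine(perItem + " " + leftover);   // 25 191
        }
    }
    ```

    Going by these numbers, the double array costs exactly 8 bytes per item plus 28 bytes for the whole file, while the struct array costs 25 bytes per item, 9 more than the 16-byte struct itself, plus 191 bytes for the whole file. That suggests the struct overhead is paid per element rather than once per file.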
    

    Code:

    using System;
    using System.Collections.Generic;
    using System.Text;
    using System.Runtime.Serialization;
    using System.Runtime.Serialization.Formatters.Binary;
    using System.IO;
    using System.Diagnostics;
    
    namespace ConsoleApplication5
    {
        class Program
        {
            static void Main(string[] args)
            {
                TestDoubleArray();
                TestStructArray();
            }
    
            private static void TestStructArray()
            {
    
                Stopwatch stopWatch = new Stopwatch();
                stopWatch.Start();
    
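                dd[] d1 = new dd[2097152];
                BinaryFormatter f1 = new BinaryFormatter();
                // Dispose the stream so the file is flushed and closed before its size is read.
                using (FileStream fs = File.Create("structs.bin"))
                {
                    f1.Serialize(fs, d1);
                }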
    
                stopWatch.Stop();
    
                Debug.WriteLine("-------------------------------------------------------------------------------");
                Debug.WriteLine("Testing array of structs");
                Debug.WriteLine("");
                Debug.WriteLine("Size of dd struct:  " + System.Runtime.InteropServices.Marshal.SizeOf(typeof(dd)).ToString());
                FileInfo fi = new FileInfo("structs.bin");
                Debug.WriteLine("Size of structs.bin:  " + fi.Length.ToString());
                Debug.WriteLine("Size per array item:  " + (fi.Length / 2097152).ToString());
                Debug.WriteLine("Milliseconds to serialize:  " + stopWatch.ElapsedMilliseconds);
                Debug.WriteLine("-------------------------------------------------------------------------------");
            }
    
            static void TestDoubleArray()
            {
                Stopwatch stopWatch = new Stopwatch();
                stopWatch.Start();
    
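                double[] d = new double[2097152];
                BinaryFormatter f = new BinaryFormatter();
                // Dispose the stream so the file is flushed and closed before its size is read.
                using (FileStream fs = File.Create("doubles.bin"))
                {
                    f.Serialize(fs, d);
                }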
    
                stopWatch.Stop();
    
                Debug.WriteLine("-------------------------------------------------------------------------------");
                Debug.WriteLine("Testing array of doubles");
                Debug.WriteLine("");
                Debug.WriteLine("Size of double:  " + sizeof(double).ToString());
                FileInfo fi = new FileInfo("doubles.bin"); // the file written above
                Debug.WriteLine("Size of doubles.bin:  " + fi.Length.ToString());
                Debug.WriteLine("Size per array item:  " + (fi.Length / 2097152).ToString());
                Debug.WriteLine("Milliseconds to serialize:  " + stopWatch.ElapsedMilliseconds);
                Debug.WriteLine("-------------------------------------------------------------------------------");
            }
    
            [Serializable]
            struct dd
            {
                double a;
                double b;
            }
        }
    }
    