How do I change my previously saved List type to serialize into an Array type

只愿长相守 submitted on 2019-12-24 12:41:47

Question


Previously, we serialized a property as a List<byte>. Now we want to change it to a byte[]. It was our understanding that you should be able to swap collection types freely between versions, but we get a ProtoBuf.ProtoException:

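using System.Collections.Generic;
using System.IO;
using System.Linq;
using NUnit.Framework;
using ProtoBuf;

[TestFixture, Category("Framework")]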
class CollectionTypeChange 
{
    [Test]
    public void TestRoundTrip()
    {
        var bytes = new List<byte>() {1,2,4};
        var a = new ArrayHolder(bytes);

        var aCopy = Deserialize<ArrayHolder>(Serialize(a));

        //Passes
        Assert.That(aCopy.CollectionOfBytes, Is.EquivalentTo(a.CollectionOfBytes));
    }

    [Test]
    public void TestChangeArrayToList()
    {
        var bytes = new List<byte>() { 1, 2, 4 };
        var a = new ArrayHolder(bytes);

        var aCopy = Deserialize<ListHolder>(Serialize(a));

        //Passes
        Assert.That(aCopy.CollectionOfBytes, Is.EquivalentTo(a.CollectionOfBytes));
    }

    [Test]
    public void TestChangeListToArray()
    {
        var bytes = new List<byte>() { 1, 2, 4 };
        var a = new ListHolder(bytes);

        //Throws: ProtoBuf.ProtoException : Invalid wire-type; this usually means you have over-written a file without truncating or setting the length; see http://stackoverflow.com/q/2152978/23354
        var aCopy = Deserialize<ArrayHolder>(Serialize(a));

        Assert.That(aCopy.CollectionOfBytes, Is.EquivalentTo(a.CollectionOfBytes));
    }

    public static byte[] Serialize<T>(T obj)
    {
        using (var stream = new MemoryStream())
        {
            Serializer.Serialize(stream, obj);
            return stream.ToArray();
        }
    }

    public static T Deserialize<T>(byte[] buffer)
    {
        using (var stream = new MemoryStream(buffer))
        {
            return Serializer.Deserialize<T>(stream);
        }
    }
}

[ProtoContract]
internal class ArrayHolder
{
    private ArrayHolder()
    {
        CollectionOfBytes = new byte[0] {};
    }

    internal ArrayHolder(IEnumerable<byte> bytesToUse )
    {
        CollectionOfBytes = bytesToUse.ToArray();
    }

    [ProtoMember(1)]
    public byte[] CollectionOfBytes { get; set; }
}

[ProtoContract]
internal class ListHolder
{
    private ListHolder()
    {
        CollectionOfBytes = new List<byte>();
    }

    internal ListHolder(IEnumerable<byte> bytesToUse)
    {
        CollectionOfBytes = bytesToUse.ToList();
    }

    [ProtoMember(1)]
    public List<byte> CollectionOfBytes { get; set; }
}

Is there something special about arrays, or about bytes, that prevents this from working as we expected?


Answer 1:


This looks to be a problem specifically with byte[] properties. If I change the property types to int[] and List<int>, the behavior is not reproducible. The problem arises from the fact that there are two ways to encode a repeated field in a Protocol Buffer: as repeated key/value pairs, or "packed" as a single key followed by a length-delimited block of values.
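To make the two layouts concrete, here is a minimal sketch that builds both byte sequences by hand for field number 1. The class and method names are hypothetical and are not part of protobuf-net; the sketch also assumes every value and the total length fit in a single varint byte (below 128).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not protobuf-net code: hand-builds the two layouts.
static class RepeatedFieldSketch
{
    // Tag byte = (field number << 3) | wire type; a single byte for fields 1-15.
    static byte Tag(int fieldNumber, int wireType)
    {
        return (byte)((fieldNumber << 3) | wireType);
    }

    // Unpacked: each element gets its own tag (wire type 0 = varint).
    public static byte[] Unpacked(int fieldNumber, byte[] values)
    {
        var output = new List<byte>();
        foreach (var v in values)
        {
            if (v > 0x7F) throw new ArgumentException("sketch handles single-byte varints only");
            output.Add(Tag(fieldNumber, 0));
            output.Add(v);
        }
        return output.ToArray();
    }

    // Packed: one tag (wire type 2 = length-delimited), one length, then the elements.
    public static byte[] Packed(int fieldNumber, byte[] values)
    {
        if (values.Length > 0x7F) throw new ArgumentException("sketch handles short payloads only");
        var output = new List<byte> { Tag(fieldNumber, 2), (byte)values.Length };
        foreach (var v in values)
        {
            if (v > 0x7F) throw new ArgumentException("sketch handles single-byte varints only");
            output.Add(v);
        }
        return output.ToArray();
    }
}
```

For the values 1, 2, 4 from the question, BitConverter.ToString(RepeatedFieldSketch.Unpacked(1, new byte[] { 1, 2, 4 })) gives 08-01-08-02-08-04, while the Packed call gives 0A-03-01-02-04. Note that the leading tag byte differs (0x08 versus 0x0A), which is what a reader inspects first.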

For byte arrays, protobuf-net uses a special serializer, BlobSerializer, which simply writes the byte array length and then block-copies the contents into the output buffer as a single length-delimited field. It does the reverse operation when reading, and does not handle the case where the data is actually in repeated key/value format.

On the other hand, List<byte> is serialized using the general-purpose ListDecorator. Its Read() method checks which format is in the input buffer and reads it appropriately, whether packed or unpacked. Its Write() method, however, writes the list unpacked by default. Consequently, when that buffer is read into a byte[] member, BlobSerializer throws an exception because the format is not what it expects. Arguably this is a bug in protobuf-net's BlobSerializer.
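The asymmetry can be illustrated with two toy readers. This is only a sketch of the behavior described above, not protobuf-net's actual implementation: the names ReadStrict and ReadTolerant are hypothetical, and the code assumes field number 1, values below 128, and a payload shorter than 128 bytes.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical readers that mimic the described behavior; not protobuf-net code.
static class WireTypeSketch
{
    // Strict reader: accepts only a length-delimited field (wire type 2).
    public static byte[] ReadStrict(byte[] buffer)
    {
        int wireType = buffer[0] & 0x07; // low three bits of the tag
        if (wireType != 2)
            throw new InvalidOperationException("Invalid wire-type: " + wireType);
        int length = buffer[1]; // sketch: length fits in one byte
        var result = new byte[length];
        Array.Copy(buffer, 2, result, 0, length);
        return result;
    }

    // Tolerant reader: inspects the first tag and accepts either layout.
    public static byte[] ReadTolerant(byte[] buffer)
    {
        if ((buffer[0] & 0x07) == 2)
            return ReadStrict(buffer); // same bytes only because values are below 128
        var result = new List<byte>();
        for (int i = 0; i + 1 < buffer.Length; i += 2)
        {
            if ((buffer[i] & 0x07) != 0)
                throw new InvalidOperationException("Invalid wire-type");
            result.Add(buffer[i + 1]); // sketch: single-byte varint
        }
        return result.ToArray();
    }
}
```

Given the unpacked bytes 08 01 08 02 08 04, ReadTolerant returns 1, 2, 4, whereas ReadStrict throws on the very first tag because it sees wire type 0 instead of 2. That matches the failing direction in the question's TestChangeListToArray.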

There is, however, a straightforward workaround: state that the List<byte> should be serialized in packed format by setting IsPacked = true:

[ProtoContract]
internal class ListHolder
{
    private ListHolder()
    {
        CollectionOfBytes = new List<byte>();
    }

    internal ListHolder(IEnumerable<byte> bytesToUse)
    {
        CollectionOfBytes = bytesToUse.ToList();
    }

    [ProtoMember(1, IsPacked = true)]
    public List<byte> CollectionOfBytes { get; set; }
}

This should be a more compact representation for your list of bytes as well.

Unfortunately, the above workaround fails when the byte collection contains bytes with the high bit set. Protobuf-net serializes a packed List<byte> as a length-delimited sequence of Base 128 Varints, so a byte with its high bit set is encoded as two bytes. A byte[] member, on the other hand, is serialized like a string, as a length-delimited sequence of raw bytes, so each byte in the array is always encoded as exactly one byte on the wire. The two encodings are therefore incompatible for values of 128 and above.
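The divergence is easy to see by encoding the single value 128 both ways. The following is a hand-rolled sketch for field number 1 (tag byte 0x0A); the class and method names are hypothetical and nothing here calls protobuf-net.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not protobuf-net code: compares the two payload encodings.
static class HighBitSketch
{
    // Base 128 varint: 7 data bits per byte, high bit set on every byte but the last.
    static void WriteVarint(List<byte> output, uint value)
    {
        while (value > 0x7F)
        {
            output.Add((byte)((value & 0x7F) | 0x80));
            value >>= 7;
        }
        output.Add((byte)value);
    }

    // Packed list layout: tag, payload length, then one varint per element.
    public static byte[] AsPackedVarints(byte[] values)
    {
        var payload = new List<byte>();
        foreach (var v in values) WriteVarint(payload, v);
        var output = new List<byte> { 0x0A };
        WriteVarint(output, (uint)payload.Count);
        output.AddRange(payload);
        return output.ToArray();
    }

    // Blob layout: tag, length, then the raw bytes copied unchanged.
    public static byte[] AsRawBytes(byte[] values)
    {
        var output = new List<byte> { 0x0A };
        WriteVarint(output, (uint)values.Length);
        output.AddRange(values);
        return output.ToArray();
    }
}
```

For new byte[] { 1, 2, 4 } both methods produce 0A-03-01-02-04, which is why the IsPacked workaround appears to work. For new byte[] { 128 } the packed form is 0A-02-80-01 while the raw form is 0A-01-80, so a reader expecting raw bytes would recover two bytes (0x80, 0x01) instead of one.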

As a workaround, one could use a private surrogate List<byte> property in the ArrayHolder type:

[ProtoContract]
internal class ArrayHolder
{
    private ArrayHolder()
    {
        CollectionOfBytes = new byte[0] { };
    }

    internal ArrayHolder(IEnumerable<byte> bytesToUse)
    {
        CollectionOfBytes = bytesToUse.ToArray();
    }

    [ProtoIgnore]
    public byte[] CollectionOfBytes { get; set; }

    [ProtoMember(1, OverwriteList = true)]
    List<byte> ListOfBytes
    {
        get
        {
            if (CollectionOfBytes == null)
                return null;
            return new List<byte>(CollectionOfBytes);
        }
        set
        {
            if (value == null)
                return;
            CollectionOfBytes = value.ToArray();
        }
    }
}

Sample fiddle.

Alternatively, one could replace the ArrayHolder with a ListHolder during (de)serialization by using MetaType.SetSurrogate() as shown for instance in this answer.



Source: https://stackoverflow.com/questions/33429468/how-do-i-change-my-previously-saved-list-type-to-serialize-into-an-array-type
