C# BinaryWriter length prefix - UTF7 encoding

不知归路 2021-01-03 01:54

I've got a project using memory-mapped files to let two apps share data with each other. The producer app is written in C#, the consumer app talks plain old C. Both use VS2

2 Answers
  • 2021-01-03 01:58

    Although the MSDN documentation for BinaryWriter.Write states that it “first writes the length of the string as a UTF-7 encoded unsigned integer”, that wording is wrong. UTF-7 is a string encoding; you cannot encode integers with it. What the documentation means (and what the code does) is that the length is written using a variable-length 7-bit encoding, sometimes known as LEB128: each byte carries seven data bits, least significant group first, and the high bit flags whether another byte follows. In your specific case, the data bytes 80 02 mean the following:

    1000 0000 0000 0010

    Nbbb bbbb Eaaa aaaa

    • N set to one means this is not the final byte
    • E set to zero means this is the final byte
    • aaaaaaa and bbbbbbb are the real data; the result is therefore:

    00000100000000

    aaaaaaabbbbbbb

    I.e. 100000000 in binary, which is 256 in decimal.
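    On the C side, the decoding described above can be sketched roughly as follows. The function name read_7bit_length and its return convention are made up for illustration, and the five-byte cap assumes the prefix encodes a 32-bit value; treat it as a starting point rather than a reference implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: decode the 7-bit variable-length prefix that
 * precedes the string bytes. Returns the number of prefix bytes consumed,
 * or 0 if the buffer ends before the final byte or the prefix runs past
 * five bytes (assumed maximum for a 32-bit value).
 */
size_t read_7bit_length(const uint8_t *buf, size_t buf_len, uint32_t *out_len)
{
    assert(buf != NULL && out_len != NULL);

    uint32_t value = 0;
    for (size_t i = 0; i < buf_len && i < 5; i++) {
        /* The low seven bits carry data; the least significant group comes first. */
        value |= (uint32_t)(buf[i] & 0x7F) << (7 * i);

        /* A clear high bit marks the final byte of the prefix. */
        if ((buf[i] & 0x80) == 0) {
            *out_len = value;
            return i + 1;
        }
    }
    return 0; /* truncated or malformed prefix */
}
```

    Called on the two bytes 80 02 from the question, this consumes both bytes and yields a length of 256; the string data starts immediately after the consumed prefix.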

  • 2021-01-03 02:11

    Despite what the Microsoft documentation says,

    1. The prefix number written is in fact an LEB128 encoded count.
    2. This is a byte count, not a character count.

    The Wiki page I linked gives you decoding code, but I would consider defining my own scheme instead. You could convert the string to UTF-8 manually using Encoding.UTF8.GetBytes() and write that to the memory-mapped file, prefixing it with a plain unsigned short holding the byte count. That way you have complete control over the layout.
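    If you go with a fixed-width prefix like that, the C consumer might read a record along these lines. This is only a sketch under stated assumptions: the record layout (a 2-byte little-endian byte count followed by the UTF-8 bytes, no terminator) and the name read_prefixed_utf8 are illustrative choices, not something the question specifies. Little-endian matches what BinaryWriter.Write(ushort) produces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: read one record laid out as
 *   [2-byte little-endian byte count][that many UTF-8 bytes]
 * Copies the bytes into dst and NUL-terminates them.
 * Returns the number of string bytes, or -1 if the record is
 * incomplete or does not fit into dst.
 */
int read_prefixed_utf8(const uint8_t *src, size_t src_len,
                       char *dst, size_t dst_cap)
{
    assert(src != NULL && dst != NULL);

    if (src_len < 2)
        return -1; /* not even a complete prefix */

    /* Assemble the count byte by byte so host endianness does not matter. */
    uint16_t n = (uint16_t)(src[0] | (src[1] << 8));

    if ((size_t)n > src_len - 2 || (size_t)n + 1 > dst_cap)
        return -1; /* payload truncated, or dst too small for bytes + NUL */

    memcpy(dst, src + 2, n);
    dst[n] = '\0';
    return (int)n;
}
```

    A 16-bit prefix caps each string at 65535 bytes, and the count refers to bytes rather than characters, the same as the built-in prefix.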
