Can the Encoding API decode a Stream/noncontinuous bytes?

[愿得一人] 2021-01-28 06:49

Usually we can get a string from a byte[] using something like

var result = Encoding.UTF8.GetString(bytes);

However,

3 Answers
  • 2021-01-28 07:17

    Working code based on Henk's answer using StreamReader:

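        // bytes is the sequence of byte[] segments to decode.
        using (var memoryStream = new MemoryStream())
        {
            using (var reader = new StreamReader(memoryStream))
            {
                foreach (var byteSegment in bytes)
                {
                    // Truncate first (this also resets Position to 0), so a segment
                    // shorter than the previous one does not leave stale bytes behind.
                    memoryStream.SetLength(0);
                    await memoryStream.WriteAsync(byteSegment, 0, byteSegment.Length);
                    // Rewind so the reader sees the bytes just written.
                    memoryStream.Seek(0, SeekOrigin.Begin);

                    Debug.WriteLine(await reader.ReadToEndAsync());
                }
            }
        }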
    
  • 2021-01-28 07:27

    "however Encoding cannot work on Stream, just byte[]."

    Correct, but a StreamReader (which is a TextReader) can be linked to a Stream.

    So just create that MemoryStream, push bytes in on one end and use ReadLine() on the other. I must say I have never tried that.

  • 2021-01-28 07:29

    The Encoding class can't deal with that directly, but the Decoder returned from Encoding.GetDecoder() can (indeed, that's its entire reason for existing). StreamReader uses a Decoder internally.

    It's slightly fiddly to work with though, as it needs to populate a char[], rather than returning a string (Encoding.GetString() and StreamReader normally handle the business of populating the char[]).

    The problem with using a MemoryStream is that you're copying all of the bytes from one array to another, for no gain. If all of your buffers are the same length, you can do this:

    var decoder = Encoding.UTF8.GetDecoder();
    // bufferSize is the length shared by every buffer.
    // +1 in case it includes a work-in-progress char from the previous buffer
    char[] chars = new char[Encoding.UTF8.GetMaxCharCount(bufferSize) + 1];
    foreach (var byteSegment in bytes)
    {
        int numChars = decoder.GetChars(byteSegment, 0, byteSegment.Length, chars, 0);
        Debug.WriteLine(new string(chars, 0, numChars));
    }
    

    If the buffers have different lengths:

    var decoder = Encoding.UTF8.GetDecoder();
    char[] chars = Array.Empty<char>();
    foreach (var byteSegment in bytes)
    {
        // Size the char buffer from this segment's own length.
        // +1 in case it includes a work-in-progress char from the previous buffer
        int charsMinSize = Encoding.UTF8.GetMaxCharCount(byteSegment.Length) + 1;
        if (chars.Length < charsMinSize)
            chars = new char[charsMinSize];
        int numChars = decoder.GetChars(byteSegment, 0, byteSegment.Length, chars, 0);
        Debug.WriteLine(new string(chars, 0, numChars));
    }
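
    As a rough, self-contained sketch of the same idea, the helper below collects the decoded text into one string instead of writing each piece to the debug output. The names ChunkedDecoding and DecodeSegments are hypothetical and only for illustration, and the final flush is an extra step (an assumption about what you want at the end of the input), so that a truncated trailing sequence is reported rather than silently dropped.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Text;

    // Hypothetical helper, not part of any library.
    public static class ChunkedDecoding
    {
        // Decodes UTF-8 bytes that arrive in arbitrary segments without merging them.
        public static string DecodeSegments(IEnumerable<byte[]> segments)
        {
            Decoder decoder = Encoding.UTF8.GetDecoder();
            var result = new StringBuilder();
            char[] chars = Array.Empty<char>();

            foreach (byte[] segment in segments)
            {
                // Upper bound on the chars this call can produce, plus the same headroom as above.
                int needed = Encoding.UTF8.GetMaxCharCount(segment.Length) + 1;
                if (chars.Length < needed)
                    chars = new char[needed];

                // flush: false keeps an incomplete trailing sequence inside the decoder
                // so that the next segment can complete it.
                int count = decoder.GetChars(segment, 0, segment.Length, chars, 0, flush: false);
                result.Append(chars, 0, count);
            }

            // End of input: flush whatever the decoder is still holding.
            int tail = decoder.GetChars(Array.Empty<byte>(), 0, 0, chars, 0, flush: true);
            result.Append(chars, 0, tail);
            return result.ToString();
        }
    }
    ```

    For example, the euro sign is encoded as the three bytes E2 82 AC. Passing the segments { 0xE2 } and { 0x82, 0xAC, 0x21 } to DecodeSegments should give "€!", because the decoder holds on to the lone E2 until the rest of the sequence arrives.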
    