Reading a null-terminated string

醉梦人生 2021-01-12 12:34

I am reading strings from a binary file. Each string is null-terminated and the encoding is UTF-8. In Python I simply read a byte, check if it's 0, append it to a byte array, and ...

4 Answers
  • 2021-01-12 12:48

    If your "binary file" contains only null-terminated UTF-8 strings, then as far as .NET is concerned it is not really a binary file but a text file, because null characters are characters too. So you can use a StreamReader (which decodes UTF-8 by default) to read the text and split it on the null characters. (Six years later, "you" would presumably be some new reader and not the OP.)

    A one line (ish) solution would be:

    using (var rdr = new StreamReader(path))
        return rdr.ReadToEnd().Split(new char[] { '\0' });
    

    But that will give you a trailing empty string if the last string in the file was "properly" terminated.
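
    One way to avoid that trailing empty entry is sketched below. It assumes that empty strings in the middle of the file are meaningful and should be kept; the names NullSplit and SplitNullTerminated are hypothetical, not part of the answer above.

    ```csharp
    using System;

    static class NullSplit
    {
        // Splits text on '\0' and drops only the final empty entry that a
        // properly terminated last string produces. Empty entries elsewhere are kept.
        public static string[] SplitNullTerminated(string text)
        {
            string[] parts = text.Split('\0');
            if (parts.Length > 0 && parts[parts.Length - 1].Length == 0)
                Array.Resize(ref parts, parts.Length - 1);
            return parts;
        }
    }
    ```

    Passing StringSplitOptions.RemoveEmptyEntries to Split would also remove the trailing entry, but it drops empty strings anywhere in the file, not only the last one.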

    A more verbose solution, expressed as an extension method on StreamReader, reads one character at a time instead of loading the whole file into a single string first, which may matter for very large files:

    // Extension method: declare it static, inside a static class.
    static System.Collections.Generic.List<string> ReadAllNullTerminated(this System.IO.StreamReader rdr)
    {
        var stringsRead = new System.Collections.Generic.List<string>();
        var bldr = new System.Text.StringBuilder();
        int nc;
        while ((nc = rdr.Read()) != -1)
        {
            Char c = (Char)nc;
            if (c == '\0')
            {
                stringsRead.Add(bldr.ToString());
                bldr.Length = 0;
            }
            else
                bldr.Append(c);
        }
    
        // Optionally return any trailing unterminated string
        if (bldr.Length != 0)
            stringsRead.Add(bldr.ToString());
    
        return stringsRead;
    }
    

    Or, for reading just one string at a time (like ReadLine):

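    // Extension method: declare it static, inside a static class.
    // Reads up to the next '\0' or the end of the stream, whichever comes first.
    static string ReadNullTerminated(this System.IO.StreamReader rdr)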
    {
        var bldr = new System.Text.StringBuilder();
        int nc;
        while ((nc = rdr.Read()) > 0)
            bldr.Append((char)nc);
    
        return bldr.ToString();
    }
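
    Note that the method above returns an empty string both for an empty entry and at the end of the file, so a caller cannot tell when to stop. A sketch of a variant that returns null at the end of the input, the way ReadLine does, might look like this. The names TextReaderExtensions and ReadNullTerminatedOrNull are hypothetical, and the method takes a TextReader (the base class of StreamReader) only so that it can also be tried on a StringReader.

    ```csharp
    using System.IO;
    using System.Text;

    static class TextReaderExtensions
    {
        public static string ReadNullTerminatedOrNull(this TextReader rdr)
        {
            int nc = rdr.Read();
            if (nc == -1)
                return null; // nothing left to read

            var bldr = new StringBuilder();
            while (nc > 0) // stop at '\0' (0) or at the end of the input (-1)
            {
                bldr.Append((char)nc);
                nc = rdr.Read();
            }
            return bldr.ToString();
        }
    }
    ```

    A caller could then loop with while ((s = rdr.ReadNullTerminatedOrNull()) != null), just as with ReadLine.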
    
  • 2021-01-12 12:59

    You can either use a List<byte>:

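    List<byte> list = new List<byte>();
    int b;
    // stream is assumed to be an open System.IO.Stream positioned at the start of a string
    while ((b = stream.ReadByte()) > 0) { // stops at the null terminator (0) or end of stream (-1)
        list.Add((byte)b);
    }

    string output = Encoding.UTF8.GetString(list.ToArray());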
    

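    Or you could use a StringBuilder, though this is only correct for pure ASCII data, because each byte is cast to a char on its own and multi-byte UTF-8 sequences would be mangled:

    StringBuilder builder = new StringBuilder();
    int b;
    // stream is assumed to be an open System.IO.Stream positioned at the start of a string
    while ((b = stream.ReadByte()) > 0) {
        builder.Append((char)b); // Append((byte)b) would append the number as text instead
    }

    string output = builder.ToString();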
    
  • 2021-01-12 13:00

    I assume you're using a StreamReader instance:

    StringBuilder sb = new StringBuilder();
    using(StreamReader rdr = OpenReader(...)) {
        Int32 nc;
        while((nc = rdr.Read()) != -1) {
              Char c = (Char)nc;
              if( c == '\0' ) break; // stop at the terminator instead of skipping it
              sb.Append( c );
        }
    }
    string output = sb.ToString();
    
  • 2021-01-12 13:05

    The following should get you what you are looking for. All of the text ends up in the myText list.

    var data = File.ReadAllBytes("myfile.bin");
    List<string> myText = new List<string>();
    int lastOffset = 0;
    for (int i = 0; i < data.Length; i++)
    {
        if (data[i] == 0)
        {
            myText.Add(System.Text.Encoding.UTF8.GetString(data, lastOffset, i - lastOffset));
            lastOffset = i + 1;
        }
    }
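
    The loop above ignores any bytes that follow the last null byte. If an unterminated final string should be kept as well, a variant along these lines might work. The names NullTerminatedBytes and SplitUtf8 are hypothetical. Splitting on raw bytes before decoding is safe because the byte 0x00 never occurs inside a multi-byte UTF-8 sequence.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Text;

    static class NullTerminatedBytes
    {
        public static List<string> SplitUtf8(byte[] data)
        {
            var result = new List<string>();
            int start = 0;
            while (start < data.Length)
            {
                // Find the next null byte at or after start.
                int end = Array.IndexOf(data, (byte)0, start);
                if (end < 0)
                    end = data.Length; // no terminator left: take the rest
                result.Add(Encoding.UTF8.GetString(data, start, end - start));
                start = end + 1;
            }
            return result;
        }
    }
    ```

    It could be called as NullTerminatedBytes.SplitUtf8(File.ReadAllBytes("myfile.bin")).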
    