What did I do wrong when parsing the MNIST dataset with BinaryReader in C#?


I'm parsing the MNIST datasets in C# from http://yann.lecun.com/exdb/mnist/

I'm trying to read the first Int32 from a binary file:

File         


        
2 Answers
  • 2021-01-13 16:06

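    50855936 == 0x03080000, or 0x00000803 when you reverse the bytes, which is required on almost any machine since little-endian has won the egg war. That is 2051, close to the 2049 you expected. The offset of 2 is most likely because 2051 is the magic number of the MNIST image files, while 2049 is the magic number of the label files. Here's an extension method to help you read it: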

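      using System;
      using System.IO;

      public static class BigEndianUtils {
          // Reads a 4-byte big-endian (MSB first) integer from the stream.
          public static int ReadBigInt32(this BinaryReader br) {
              var bytes = br.ReadBytes(sizeof(Int32));
              if (bytes.Length < sizeof(Int32)) throw new EndOfStreamException();
              // BitConverter uses the machine's byte order, so swap on little-endian hosts.
              if (BitConverter.IsLittleEndian) Array.Reverse(bytes);
              return BitConverter.ToInt32(bytes, 0);
          }
      }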
    

    Add further methods if the file contains other field types; just substitute Int32 (and the matching BitConverter call) in the snippet.
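
    As a rough usage sketch of the same idea, the program below reads a whole header with big-endian fields. It assumes the image-file layout described on the MNIST page (magic number, image count, rows, columns, each a 4-byte MSB-first integer) and uses an in-memory byte array as a stand-in for the real file. The names `MnistHeaderDemo` and `ReadInt32BE` are hypothetical, and `BinaryPrimitives` requires .NET Core 2.1 or later (or the System.Memory package).

    ```csharp
    using System;
    using System.Buffers.Binary;
    using System.IO;

    public static class MnistHeaderDemo
    {
        // Hypothetical helper: reads one big-endian Int32 from the reader.
        public static int ReadInt32BE(BinaryReader br)
        {
            byte[] bytes = br.ReadBytes(4);
            if (bytes.Length < 4) throw new EndOfStreamException();
            return BinaryPrimitives.ReadInt32BigEndian(bytes);
        }

        public static void Main()
        {
            // Assumed stand-in for the first 16 bytes of an MNIST image file.
            byte[] header =
            {
                0x00, 0x00, 0x08, 0x03, // magic number 2051
                0x00, 0x00, 0xEA, 0x60, // 60000 images
                0x00, 0x00, 0x00, 0x1C, // 28 rows
                0x00, 0x00, 0x00, 0x1C  // 28 columns
            };

            using (var br = new BinaryReader(new MemoryStream(header)))
            {
                // Plain ReadInt32 interprets the bytes as little-endian.
                Console.WriteLine(br.ReadInt32());  // 50855936

                br.BaseStream.Position = 0;         // rewind and read MSB first
                Console.WriteLine(ReadInt32BE(br)); // 2051
                Console.WriteLine(ReadInt32BE(br)); // 60000
                Console.WriteLine(ReadInt32BE(br)); // 28
                Console.WriteLine(ReadInt32BE(br)); // 28
            }
        }
    }
    ```

    With a real file, the `MemoryStream` would be replaced by `File.OpenRead(path)`; the rest stays the same.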

  • 2021-01-13 16:21

    It seems that your problem is somewhere else. Could you post a minimal compilable snippet that doesn't work as expected?

    For example, this snippet works exactly as expected: it creates a binary file of 8 bytes holding two Int32s in little-endian order (the order BinaryWriter always uses). The reader then correctly reads the data back as the two integers.

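      // Write two Int32 values; BinaryWriter always emits little-endian bytes.
      using (var str = File.Create("C:\\Test.dat"))
      using (var wr = new BinaryWriter(str))
      {
          wr.Write(2049);
          wr.Write(60000);
      }

      // Read them back; Dump() is LINQPad's output extension method.
      using (var str = File.Open("C:\\Test.dat", FileMode.Open))
      using (var rdr = new BinaryReader(str))
      {
          rdr.ReadInt32().Dump();
          rdr.ReadInt32().Dump();
      }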
    

    However, the endianness of BinaryReader and BinaryWriter is fixed (little-endian). If you need MSB-first order, you have to read the bytes and convert them to integers yourself (or you could, of course, invert the byte order using bitwise operations, if you're so inclined).
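
    The bitwise route could look roughly like the sketch below. `ByteSwapDemo` and `SwapBytes` are hypothetical names, and the input is the value from the question: read the Int32 with the ordinary `ReadInt32()`, then reverse its four bytes with shifts and masks.

    ```csharp
    using System;

    public static class ByteSwapDemo
    {
        // Reverses the byte order of a 32-bit integer using shifts and masks.
        public static int SwapBytes(int value)
        {
            unchecked
            {
                uint v = (uint)value;
                uint swapped = (v >> 24)                 // byte 3 -> byte 0
                             | ((v >> 8) & 0x0000FF00u)  // byte 2 -> byte 1
                             | ((v << 8) & 0x00FF0000u)  // byte 1 -> byte 2
                             | (v << 24);                // byte 0 -> byte 3
                return (int)swapped;
            }
        }

        public static void Main()
        {
            // 50855936 is 0x03080000; swapped it becomes 0x00000803.
            Console.WriteLine(SwapBytes(50855936)); // 2051
        }
    }
    ```

    The swap is unconditional, so it is only correct where the value was read as little-endian, which is always the case for `BinaryReader.ReadInt32()`.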
