Possible to calculate MD5 (or other) hash with buffered reads?

长情又很酷 2020-11-27 03:37

I need to calculate checksums of quite large files (gigabytes). This can be accomplished using the following method:

    private byte[] calcHash(string file         


        
5 Answers
  • 2020-11-27 04:05

    It seems you can use TransformBlock / TransformFinalBlock, as shown in this sample: Displaying progress updates when hashing large files
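
    The linked sample is not reproduced here, so the following is only a minimal sketch of the idea, assuming the data comes from a Stream. The class name ProgressHash and the reportProgress callback are hypothetical names chosen for illustration; only MD5, TransformBlock and TransformFinalBlock come from the framework.

    ```csharp
    using System;
    using System.IO;
    using System.Security.Cryptography;

    static class ProgressHash
    {
        // Hashes a stream in fixed-size chunks and reports the running byte count.
        // reportProgress is a hypothetical callback, not part of any .NET API.
        public static byte[] ComputeMd5(Stream input, Action<long> reportProgress, int bufferSize = 81920)
        {
            using (var md5 = MD5.Create())
            {
                var buffer = new byte[bufferSize];
                long total = 0;
                int read;
                while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
                {
                    // A null output buffer is allowed; only the hash state is updated.
                    md5.TransformBlock(buffer, 0, read, null, 0);
                    total += read;
                    if (reportProgress != null)
                    {
                        reportProgress(total);
                    }
                }
                // An empty final block finalises the hash.
                md5.TransformFinalBlock(buffer, 0, 0);
                return md5.Hash;
            }
        }
    }
    ```

    Only one buffer of bufferSize bytes is held in memory at a time, so the file size does not affect memory use. Dividing the reported count by the stream length gives a percentage for a progress bar.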

  • 2020-11-27 04:08

    Hash algorithms are expected to handle this situation and are typically implemented with 3 functions:

    hash_init() - Called to allocate resources and begin the hash.
    hash_update() - Called with new data as it arrives.
    hash_final() - Called to complete the calculation and free resources.

    Look at http://www.openssl.org/docs/crypto/md5.html or http://www.openssl.org/docs/crypto/sha.html for good, standard examples in C; I'm sure there are similar libraries for your platform.
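
    For .NET, one class that seems to map closely onto this init / update / final shape is IncrementalHash. The sketch below assumes a runtime that ships it (.NET Core, or a recent .NET Framework release); the class and method names IncrementalExample and HashStream are made up for illustration.

    ```csharp
    using System;
    using System.IO;
    using System.Security.Cryptography;

    static class IncrementalExample
    {
        public static byte[] HashStream(Stream input, HashAlgorithmName algorithm)
        {
            // "init": allocate the hash state for the chosen algorithm
            using (var hasher = IncrementalHash.CreateHash(algorithm))
            {
                var buffer = new byte[81920];
                int read;
                while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
                {
                    // "update": feed each chunk as it arrives
                    hasher.AppendData(buffer, 0, read);
                }
                // "final": complete the calculation and return the digest
                return hasher.GetHashAndReset();
            }
        }
    }
    ```

    Passing HashAlgorithmName.MD5 or HashAlgorithmName.SHA256 selects the algorithm without changing the loop.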

  • 2020-11-27 04:14

    You use the TransformBlock and TransformFinalBlock methods to process the data in chunks.

    // Init
    MD5 md5 = MD5.Create();
    int offset = 0;
    
    // For each block:
    offset += md5.TransformBlock(block, 0, block.Length, block, 0);
    
    // For last block:
    md5.TransformFinalBlock(block, 0, block.Length);
    
    // Get the hash code
    byte[] hash = md5.Hash;
    

    Note: It works (at least with the MD5 provider) to send all blocks to TransformBlock and then send an empty block to TransformFinalBlock to finalise the process.
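
    A small way to check the note above against other providers is to hash a sequence of chunks and compare the result with ComputeHash over the same bytes. This is a sketch only; ChunkedHash and HashChunks are hypothetical names, and the chunks are assumed to arrive as separate byte arrays.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Security.Cryptography;

    static class ChunkedHash
    {
        // Sends every chunk to TransformBlock, then finalises with an empty block.
        // The caller owns the algorithm instance and is responsible for disposing it.
        public static byte[] HashChunks(HashAlgorithm algorithm, IEnumerable<byte[]> chunks)
        {
            foreach (byte[] chunk in chunks)
            {
                algorithm.TransformBlock(chunk, 0, chunk.Length, null, 0);
            }
            algorithm.TransformFinalBlock(new byte[0], 0, 0);
            return algorithm.Hash;
        }
    }
    ```

    Calling HashChunks with MD5.Create() or SHA256.Create() and the same chunks lets you compare each provider's chunked result against its one-shot ComputeHash.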

  • 2020-11-27 04:14

    I like the answer above, but for completeness, and as a more general solution, consider the CryptoStream class. If you are already handling streams, it is easy to wrap your stream in a CryptoStream, passing a HashAlgorithm as the ICryptoTransform parameter.

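    // Writes each buffer to foo.txt and hashes it on the way through.
    var file = new FileStream("foo.txt", FileMode.Open, FileAccess.Write);
    var md5 = MD5.Create();
    var cs = new CryptoStream(file, md5, CryptoStreamMode.Write);
    while (notDoneYet)              // placeholder loop condition
    {
        byte[] buffer = Get32MB();  // placeholder: fetch the next chunk of data
        cs.Write(buffer, 0, buffer.Length);
    }
    cs.FlushFinalBlock();           // finalise the hash; md5.Hash is not valid before this
    System.Console.WriteLine(BitConverter.ToString(md5.Hash));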
    

    The hash is only finalised once the CryptoStream has flushed its final block, so call FlushFinalBlock (or close the stream) before reading md5.Hash.
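
    To hash data that already exists, rather than data being written out, one option is to point the CryptoStream at Stream.Null and copy the source into it. The following is a sketch under that assumption; CryptoStreamHash and HashStream are hypothetical names, and the caller is assumed to own and dispose both the input stream and the algorithm.

    ```csharp
    using System;
    using System.IO;
    using System.Security.Cryptography;

    static class CryptoStreamHash
    {
        public static byte[] HashStream(Stream input, HashAlgorithm algorithm)
        {
            // Stream.Null discards the pass-through output; only the hash is kept.
            using (var cs = new CryptoStream(Stream.Null, algorithm, CryptoStreamMode.Write))
            {
                // CopyTo reads the input in buffered chunks, not all at once.
                input.CopyTo(cs);
                // Finalise the hash before reading it.
                cs.FlushFinalBlock();
                return algorithm.Hash;
            }
        }
    }
    ```

    For a file, pass a FileStream opened with FileAccess.Read as the input.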

  • 2020-11-27 04:23

    I've just had to do something similar, but wanted to read the file asynchronously. It's using TransformBlock and TransformFinalBlock and is giving me answers consistent with Azure, so I think it is correct!

    // Requires: System.Buffers, System.IO, System.Security.Cryptography, System.Threading.Tasks
    private static async Task<string> CalculateMD5Async(string fullFileName)
    {
        // Rent may return a larger array than requested; block.Length is used below.
        var block = ArrayPool<byte>.Shared.Rent(8192);
        try
        {
            using (var md5 = MD5.Create())
            {
                // The final "true" opens the file for asynchronous I/O.
                using (var stream = new FileStream(fullFileName, FileMode.Open, FileAccess.Read, FileShare.Read, 8192, true))
                {
                    int length;
                    while ((length = await stream.ReadAsync(block, 0, block.Length).ConfigureAwait(false)) > 0)
                    {
                        md5.TransformBlock(block, 0, length, null, 0);
                    }
                    // An empty final block finalises the hash.
                    md5.TransformFinalBlock(block, 0, 0);
                }
                var hash = md5.Hash;
                return Convert.ToBase64String(hash);
            }
        }
        finally
        {
            ArrayPool<byte>.Shared.Return(block);
        }
    }
    