How to generate md5-hashes for large files with VBA?

前端 未结 2 845
耶瑟儿~
耶瑟儿~ 2021-01-13 05:07

I have the following functions to generate md5-hashes for files. The functions work great for small files, but crashes and generate Run-time error 7 - Out of memory<

相关标签:
2条回答
  • 2021-01-13 05:40

    It looks like you reached the memory limit. A better way would be to compute the MD5 of the file by block:

    Public Function ComputeMD5(filepath As String) As String
      Dim buffer() As Byte, svc As Object, hFile%, blockSize&, i&
      blockSize = 2 ^ 16
    
      ' open the file '
    
      If Len(Dir(filepath)) Then Else Err.Raise 5, , "file not found" & vbCr & filepath
    
      hFile = FreeFile
      Open filepath For Binary Access Read As hFile
    
      ' allocate buffer '
    
      If LOF(hFile) < blockSize Then blockSize = ((LOF(hFile) + 1024) \ 1024) * 1024
      ReDim buffer(0 To blockSize - 1)
    
      ' compute hash '
    
      Set svc = CreateObject("System.Security.Cryptography.MD5CryptoServiceProvider")
    
      For i = 1 To LOF(hFile) \ blockSize
        Get hFile, , buffer
        svc.TransformBlock buffer, 0, blockSize, buffer, 0
      Next
    
      Get hFile, , buffer
      svc.TransformFinalBlock buffer, 0, LOF(hFile) Mod blockSize
      buffer = svc.Hash
    
      ' cleanup '
    
      svc.Clear
      Close hFile
    
      ' convert to an hexa string '
    
      ComputeMD5 = String$(32, "0")
    
      For i = 0 To 15
         Mid$(ComputeMD5, i + i + 2 + (buffer(i) > 15)) = Hex(buffer(i))
      Next
    
    End Function
    
    0 讨论(0)
  • 2021-01-13 05:59

    This is an extension to FlorentB's answer, which worked brilliantly for me until my files surpassed the 2GB LOF() size limit.

    I tried to adapt for getting file length by alternate means as follows:

    Public Function ComputeMD5(filepath As String) As String
        If Len(Dir(filepath)) Then Else Err.Raise 5, , "File not found." & vbCr & filepath
    
        Dim blockSize As Long: blockSize = 2 ^ 20
        Dim blockSize_f As Double
        Dim buffer() As Byte
        Dim fileLength As Variant
        Dim hFile As Integer
        Dim n_Reads As Long
        Dim i As Long
        Dim svc As Object: Set svc = CreateObject("System.Security.Cryptography.MD5CryptoServiceProvider")
    
        fileLength = DecGetFileSize(filepath)
        If fileLength < blockSize Then blockSize = ((fileLength + 1024) \ 1024) * 1024
        ReDim buffer(0 To blockSize - 1)
        n_Reads = fileLength / blockSize
        blockSize_f = fileLength - (CDbl(blockSize) * n_Reads)
    
        hFile = FreeFile
        Open filepath For Binary Access Read As hFile
        For i = 1 To n_Reads
            Get hFile, i, buffer
            svc.TransformBlock buffer, 0, blockSize, buffer, 0
        Next i
    
        Get hFile, i, buffer
        svc.TransformFinalBlock buffer, 0, blockSize_f
        buffer = svc.Hash
        svc.Clear
        Close hFile
    
        ComputeMD5 = String$(32, "0")
        For i = 0 To 15
            Mid$(ComputeMD5, i + i + 2 + (buffer(i) > 15)) = Hex(buffer(i))
        Next
    
    End Function
    
    Public Function DecGetFileSize(fname As String) As Variant
        Dim fso As New FileSystemObject
        Dim f: Set f = fso.GetFile(fname)
        DecGetFileSize = CDec(f.Size)
        Set f = Nothing
        Set fso = Nothing
    End Function
    

    This all runs fine, returning a string, however that string does not equal the MD5 calculated using other tools on the same file.

    I can't work out where the discrepancy is originating.

    I've checked and double checked filelength, n_reads, blockSize and blockSize_f and I'm sure those values are all correct.

    I had some trouble with the Get function, where if I didn't explicitly tell it the block number, it dies at block 2048.

    Any ideas / pointers would be much appreciated.

    0 讨论(0)
提交回复
热议问题