What is a superfast way to read large files line-by-line in VBA?

南方客 2020-12-01 09:46

I believe I have come up with a very efficient way to read very, very large files line-by-line. Please tell me if you know of a better/faster way or see room for improvement.

9 Answers
  • 2020-12-01 10:12

    Be careful when using Application.Transpose with a huge number of values. If you transpose values to a column, Excel will assume you transposed them from a row.

    Because Max Column Limit < Max Row Limit, it will only display the first (Max Column Limit) values, and anything after that will be "N/A".
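
    One way to sidestep the limit is to skip Application.Transpose and build the column array yourself. The following is a minimal sketch, assuming a 1-D input array such as the result of Split; ToColumn is a hypothetical helper name, not part of Excel:

```vba
' Hypothetical helper: copies a 1-D array into an (n x 1) 2-D array so it can be
' assigned to a worksheet column without calling Application.Transpose.
Public Function ToColumn(ByRef values As Variant) As Variant
    Dim result() As Variant
    Dim lo As Long, hi As Long, i As Long

    lo = LBound(values)
    hi = UBound(values)
    If hi < lo Then Exit Function          ' empty input: returns Empty

    ReDim result(1 To hi - lo + 1, 1 To 1)
    For i = lo To hi
        result(i - lo + 1, 1) = values(i)
    Next i
    ToColumn = result
End Function
```

    The result can then be written in one assignment, for example Range("A1").Resize(UBound(col, 1), 1).Value = col, where col holds the return value.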

  • 2020-12-01 10:17

    I just wanted to share some of my results...

    I have text files that apparently came from a Linux system, so each line ends with only vbLf/Chr(10), with no vbCr/Chr(13).

    Note 1:

    • This meant that the Line Input method would read in the entire file, instead of just one line at a time.

    From my testing of small (152 KB) and large (2778 KB) files, both on and off the network, I found the following:

    Open FileName For Input: Line Input was the slowest (See Note 1 above)

    Open FileName For Binary Access Read: Input was the fastest for reading the whole file

    FSO.OpenTextFile: ReadLine was fast, but a bit slower than Binary Input
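
    For reference, a whole-file Binary Input read that also copes with LF-only line endings might look like the sketch below. It assumes single-byte (ANSI) text that fits in memory, and ReadAllLines is a hypothetical name:

```vba
' Hypothetical helper: reads a whole file in one call and splits it into lines,
' treating CRLF, a lone LF, and a lone CR as line breaks.
' Assumes single-byte (ANSI) text small enough to hold in memory.
Public Function ReadAllLines(ByVal filePath As String) As Variant
    Dim f As Integer
    Dim content As String

    f = FreeFile
    Open filePath For Binary Access Read As #f
    content = Input$(LOF(f), #f)           ' LOF gives the file length in bytes
    Close #f

    ' Normalize every line-ending style to vbLf before splitting
    content = Replace(content, vbCrLf, vbLf)
    content = Replace(content, vbCr, vbLf)
    ReadAllLines = Split(content, vbLf)
End Function
```

    Note that a file ending with a line break yields one extra empty element at the end of the array.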

    Note 2:

    • If I just needed to check the file header (first 1-2 lines) to confirm that I had the proper file/format, then FSO.OpenTextFile was the fastest, followed very closely by Binary Input.

    • The drawback with the Binary Input is that you have to know how many characters you want to read.

    • On normal files, Line Input would be a good option as well, but I couldn't test it due to Note 1.
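
    The header check with Binary Input can be sketched as follows. ReadFileHead and its maxChars parameter are hypothetical names, and the sketch assumes single-byte (ANSI) text. It caps the request at the file length so a short file does not raise an "Input past end of file" error:

```vba
' Hypothetical helper: returns up to maxChars characters from the start of a file,
' e.g. to inspect a header without loading the rest of the file.
Public Function ReadFileHead(ByVal filePath As String, ByVal maxChars As Long) As String
    Dim f As Integer
    Dim available As Long

    f = FreeFile
    Open filePath For Binary Access Read As #f
    available = LOF(f)                      ' file length in bytes
    If available > maxChars Then available = maxChars
    If available > 0 Then ReadFileHead = Input$(available, #f)
    Close #f
End Function
```

    A caller could then test the result with InStr or Left$ against whatever marker the expected format starts with.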

     

    Note 3:

    • Obviously, the files on the network showed the largest difference in read speed. They also showed the greatest benefit from reading the file a second time (although there are certainly memory buffers that come into play here).
  • 2020-12-01 10:18

    With that code you load the file in memory (as a big string) and then you read that string line by line.

    By using Mid$() and InStr() you actually read the "file" twice but since it's in memory, there is no problem.
    I don't know if VB's String has a length limit (probably not), but if the text files are hundreds of megabytes in size, you are likely to see a performance drop due to virtual memory usage.
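
    To illustrate the two passes, here is a minimal sketch of walking an in-memory string with InStr (which scans for the next line break) and Mid$ (which copies the line out). It is not the asker's code; CountLines is a hypothetical name, and lines are assumed to end with vbLf, optionally preceded by vbCr:

```vba
' Hypothetical sketch: walks an in-memory string line by line using InStr and Mid$,
' and returns the number of lines found.
Public Function CountLines(ByRef content As String) As Long
    Dim startPos As Long, breakPos As Long
    Dim lineText As String

    startPos = 1
    Do While startPos <= Len(content)
        breakPos = InStr(startPos, content, vbLf)             ' first pass: find the break
        If breakPos = 0 Then breakPos = Len(content) + 1      ' last line has no terminator
        lineText = Mid$(content, startPos, breakPos - startPos) ' second pass: copy the line
        If Right$(lineText, 1) = vbCr Then lineText = Left$(lineText, Len(lineText) - 1)
        ' ... process lineText here ...
        CountLines = CountLines + 1
        startPos = breakPos + 1
    Loop
End Function
```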

  • 2020-12-01 10:21

    You can use Scripting.FileSystemObject to do that. From the Reference:

    The ReadLine method allows a script to read individual lines in a text file. To use this method, open the text file, and then set up a Do Loop that continues until the AtEndOfStream property is True. (This simply means that you have reached the end of the file.) Within the Do Loop, call the ReadLine method, store the contents of the first line in a variable, and then perform some action. When the script loops around, it will automatically drop down a line and read the second line of the file into the variable. This will continue until each line has been read (or until the script specifically exits the loop).

    And a quick example:

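    Dim objFSO As Object, objFile As Object, strLine As String
    Set objFSO = CreateObject("Scripting.FileSystemObject")
    ' 1 = ForReading (the named constant is not available with late binding)
    Set objFile = objFSO.OpenTextFile("C:\FSO\ServerList.txt", 1)
    Do Until objFile.AtEndOfStream
        strLine = objFile.ReadLine
        MsgBox strLine
    Loop
    objFile.Close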
    
  • 2020-12-01 10:21

    You can modify the above to read the full file in one go and then display each line, as shown below:

    Option Explicit
    
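    ' Reads the whole file into a string with a single Get, then splits it into lines.
    Public Function QuickRead(FName As String) As Variant
        Dim i As Integer
        Dim res As String
        Dim l As Long
    
        i = FreeFile
        l = FileLen(FName)
        res = Space$(l)         ' pre-size the buffer to the file length
        Open FName For Binary Access Read As #i
        Get #i, , res
        Close #i
        ' Split on vbCrLf; files with LF-only line endings need vbLf instead
        QuickRead = Split(res, vbCrLf)
    End Function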
    
    Sub Test()
        ' Replace "C:\writename.txt" with any file name you like
        Dim strFilePathName As String: strFilePathName = "C:\writename.txt"
        Dim v As Variant
        Dim i As Long
        v = QuickRead(strFilePathName)
        For i = LBound(v) To UBound(v)
            MsgBox v(i)
        Next i
    End Sub
    
  • 2020-12-01 10:22

    My two cents…

    Not long ago I needed to read large files using VBA and noticed this question. I tested three approaches to reading data from a file to compare their speed and reliability across a wide range of file sizes and line lengths. The approaches are:

    1. The Line Input VBA statement
    2. The File System Object (FSO)
    3. The Get VBA statement for the whole file, followed by parsing the resulting string as described in other posts here

    Each test case consists of three steps:

    1. Test case setup. Write a text file containing a given number of lines of the same given length, filled with a known character pattern.
    2. Integrity test. Read each file line and verify its length and contents.
    3. File read speed test. Read each line of the file, repeated 10 times.

    As you can see, Step #3 measures the true file read speed (as asked in the question), while Step #2 verifies the file read integrity and therefore simulates real conditions in which string parsing is needed.
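
    A rough sketch of what such a setup step and a timed Line Input read could look like is shown below. This is not the original test harness: WriteTestFile and TimeLineInput are hypothetical names, the digit pattern is an arbitrary choice, and Timer restarts at midnight, so a run that spans midnight would give a wrong result.

```vba
' Hypothetical setup step: writes lineCount lines, each lineLength characters long,
' filled with a repeating "0123456789" pattern. Print # appends CRLF to each line.
Public Sub WriteTestFile(ByVal filePath As String, ByVal lineCount As Long, ByVal lineLength As Long)
    Dim f As Integer, i As Long
    Dim lineText As String

    ' Repeat the 10-character pattern often enough, then trim to the exact length
    lineText = Left$(Replace(Space$(lineLength \ 10 + 1), " ", "0123456789"), lineLength)

    f = FreeFile
    Open filePath For Output As #f
    For i = 1 To lineCount
        Print #f, lineText
    Next i
    Close #f
End Sub

' Hypothetical timing step: reads the file with Line Input, reports the number of
' lines through linesRead, and returns the elapsed time in seconds.
Public Function TimeLineInput(ByVal filePath As String, ByRef linesRead As Long) As Double
    Dim f As Integer
    Dim lineText As String
    Dim started As Double

    started = Timer
    f = FreeFile
    Open filePath For Input As #f
    linesRead = 0
    Do Until EOF(f)
        Line Input #f, lineText
        linesRead = linesRead + 1
    Loop
    Close #f
    TimeLineInput = Timer - started
End Function
```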

    The following chart shows the results of the file read speed test. The file size is 64 MB for all tests; the tests differ in line length, which varies from 2 bytes (not including CRLF) to 8 MB.

    No idea why it is not displayed any longer :(

    CONCLUSION:

    1. All three methods are reliable for large files with normal and abnormal line lengths (please compare to Graeme Howard’s answer).
    2. All three methods produce almost equivalent file reading speeds for normal line lengths.
    3. The “superfast way” (Method #3) works fine for extremely long lines, while the other two don’t.
    4. All of this applies across different Office versions and different PCs, for both VBA and VB6.