I believe I have come up with a very efficient way to read very, very large files line-by-line. Please tell me if you know of a better/faster way or see room for improvemen
My take on it...obviously, you've got to do something with the data you read in. If it involves writing it to the sheet, that'll be deadly slow with a normal For Loop. I came up with the following based upon a rehash of some of the items there, plus some help from the Chip Pearson website.
Reading in the text file (assuming you don't know the length of the range it will create, so only the startingCell is given):
Public Sub ReadInPlainText(startCell As Range, Optional textfilename As Variant)
If IsMissing(textfilename) Then textfilename = Application.GetOpenFilename("All Files (*.*), *.*", , "Select Text File to Read")
If textfilename = "" Then Exit Sub
Dim filelength As Long
Dim filenumber As Integer
filenumber = FreeFile
filelength = filelen(textfilename)
Dim text As String
Dim textlines As Variant
Open textfilename For Binary Access Read As filenumber
text = Space(filelength)
Get #filenumber, , text
'split the file with vbcrlf
textlines = Split(text, vbCrLf)
'output to range
Dim outputRange As Range
Set outputRange = startCell
Set outputRange = outputRange.Resize(UBound(textlines), 1)
outputRange.Value = Application.Transpose(textlines)
Close filenumber
End Sub
Conversely, if you need to write out a range to a text file, this does it quickly in one print statement (note: the file 'Open' type here is in text mode, not binary..unlike the read routine above).
Public Sub WriteRangeAsPlainText(ExportRange As Range, Optional textfilename As Variant)
If IsMissing(textfilename) Then textfilename = Application.GetSaveAsFilename(FileFilter:="Text Files (*.txt), *.txt")
If textfilename = "" Then Exit Sub
Dim filenumber As Integer
filenumber = FreeFile
Open textfilename For Output As filenumber
Dim textlines() As Variant, outputvar As Variant
textlines = Application.Transpose(ExportRange.Value)
outputvar = Join(textlines, vbCrLf)
Print #filenumber, outputvar
Close filenumber
End Sub
Line Input works fine for small files. However, when file sizes reach around 90k, Line Input jumps all over the place and reads data in the wrong order from the source file. I tested it with different filesizes:
49k = ok
60k = ok
78k = ok
85k = ok
93k = error
101k = error
127k = error
156k = error
Lesson learned - use Scripting.FileSystemObject
I would think , in a large file scenario using a stream would be far more efficient, because memory consumption would be very small.
But your algorithm could alternate between using a stream and loading the entire thing in memory based on the file size. I wouldn't be surprised if one is only better than the other under certain criteria.