I've downloaded a dataset that details all of the car accidents reported in England between January 1979 and December 2004. The file is in CSV format and is understandably quite large (6,224,199 rows, to be exact). Because that exceeds the 1,048,576 rows that Excel 2010 can handle, I'd have to split the file into smaller ones just to open it in Excel. I tried using Notepad and Notepad++, but Notepad crashed, and Notepad++ refused to open such a large (720 MB) file. I've considered using an Excel replacement like Delimit, but it doesn't support macros.

Setting the size issue aside, I need to count the total number of crashes in each month and make a note of them. There's a column that gives the date of each crash, but the rows aren't sorted by crash date. I was considering using Ctrl+F to count the number of rows with a specific month/year value and then logging the number of results for each search, but since the data spans 26 years, I'd have to manually search and record the results for 312 months.
I agree with Jeanno and Brad: Access is a better tool than Excel for this type of requirement. However, I wondered whether an attempt to read such a large file with Excel would finish in a reasonable time.
I concatenated some large text files to create a file of 663 MB, which I thought was close enough. The macro below reads each line of the file and splits it into fields ready for analysis. Note: my file uses "|" as a delimiter instead of ",".
The macro reads 7,782,013 records in a little over 100 seconds. Access is still the better option, but Excel is feasible if Access is not available.
Note: this macro needs a reference to "Microsoft Scripting Runtime".
Sub ReadAndSplit()

    Dim FileStream As TextStream
    Dim FileSysObj As FileSystemObject
    Dim Line As String
    Dim LinePart() As String
    Dim NumLines As Long
    Dim TimeStart As Double

    TimeStart = Timer

    Set FileSysObj = CreateObject("Scripting.FileSystemObject")
    NumLines = 0

    ' 1 means open read only
    Set FileStream = FileSysObj.OpenTextFile(ThisWorkbook.Path & "\Test4.txt", 1)

    Do While Not FileStream.AtEndOfStream
        Line = FileStream.ReadLine
        NumLines = NumLines + 1
        ' Split the record into fields ready for analysis
        LinePart = Split(Line, "|")
    Loop

    FileStream.Close

    ' Report the number of records read and the elapsed seconds
    Debug.Print NumLines
    Debug.Print Timer - TimeStart

End Sub
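
For the actual question (crashes per month), the same read loop could tally counts as it goes instead of only splitting each line. What follows is a rough sketch rather than a tested solution. It makes several assumptions that are not in the question: the file is called "Accidents.csv" and sits next to the workbook, it has a header row, the date is in the 10th field (zero-based index 9) in dd/mm/yyyy form, and no field contains a quoted comma. Adjust those to match the real file. It uses late binding, so this version should not need the "Microsoft Scripting Runtime" reference.

```vba
Option Explicit

' Sketch only: count rows per calendar month in a large delimited file.
' Hypothetical values: the file name, DATE_FIELD, the dd/mm/yyyy date layout
' and the presence of a header row are assumptions, not facts about the dataset.
Sub CountCrashesPerMonth()

    Const DATE_FIELD As Long = 9        ' assumed zero-based index of the date column
    Const DELIM As String = ","

    Dim FileSysObj As Object
    Dim FileStream As Object
    Dim Counts As Object                ' Scripting.Dictionary: "yyyy-mm" -> row count
    Dim OutSheet As Worksheet
    Dim LinePart() As String
    Dim DateBits() As String
    Dim MonthKey As String
    Dim Key As Variant
    Dim RowOut As Long

    Set FileSysObj = CreateObject("Scripting.FileSystemObject")
    Set Counts = CreateObject("Scripting.Dictionary")

    ' 1 means open read only
    Set FileStream = FileSysObj.OpenTextFile(ThisWorkbook.Path & "\Accidents.csv", 1)

    ' Skip the assumed header row
    If Not FileStream.AtEndOfStream Then FileStream.SkipLine

    Do While Not FileStream.AtEndOfStream
        LinePart = Split(FileStream.ReadLine, DELIM)
        ' Ignore blank or short lines that have no date field
        If UBound(LinePart) >= DATE_FIELD Then
            DateBits = Split(LinePart(DATE_FIELD), "/")
            If UBound(DateBits) = 2 Then
                ' dd/mm/yyyy -> "yyyy-mm"
                MonthKey = DateBits(2) & "-" & DateBits(1)
                ' A missing key reads as Empty, so the first hit stores 1
                Counts(MonthKey) = Counts(MonthKey) + 1
            End If
        End If
    Loop

    FileStream.Close

    ' Write one row per month to a new worksheet
    Set OutSheet = ThisWorkbook.Worksheets.Add
    RowOut = 1
    For Each Key In Counts.Keys
        ' Store the key as text so Excel does not convert it to a date
        OutSheet.Cells(RowOut, 1).NumberFormat = "@"
        OutSheet.Cells(RowOut, 1).Value = Key
        OutSheet.Cells(RowOut, 2).Value = Counts(Key)
        RowOut = RowOut + 1
    Next Key

End Sub
```

The dictionary returns its keys in the order they were first seen, and the file is not sorted by date, so the output will need sorting. Because the keys are in "yyyy-mm" form, sorting column A in ascending order puts the months in chronological order. The result has at most a few hundred rows, so it fits comfortably in a worksheet even though the source file does not.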