How can I count and log the number of rows in a sheet with a specific month/year value?

Submitted by 魔方 西西 on 2020-01-17 11:15:56

Question


I've downloaded a dataset that details all of the car accidents reported in England between January 1979 and December 2004. The file is in CSV format and is understandably large (6,224,199 rows, to be exact). Because it exceeds the number of rows that Excel 2010 can handle (1,048,576 per worksheet), I'd have to split it into smaller files in order to open it in Excel. I tried Notepad and Notepad++, but Notepad crashed and Notepad++ refused to open such a large (720 MB) file. I've considered an Excel replacement such as Delimit, but it doesn't support macros.

Setting the size issue aside, I need to count the total number of crashes in each month and record the results. There's a column that gives the date of each crash, but the rows aren't sorted by that date. I was considering using Ctrl+F to count the rows with a specific month/year value and then logging the number of results for each search, but since the data spans roughly 25 years, I'd have to manually search and record the results for some 300 months.


Answer 1:


I agree with Jeanno and Brad: Access is a better tool than Excel for this type of requirement. However, I wondered whether reading such a large file with Excel would take a realistic amount of time.

I concatenated some large text files to create a file of 663 MB, which I thought was close enough. The macro below reads each line of the file and splits it into fields ready for analysis. Note: my file uses "|" as a delimiter instead of ",".

The macro reads 7,782,013 records in a little over 100 seconds. Access is still the better option but Excel is feasible if Access is not available.

Note: this macro needs a reference to "Microsoft Scripting Runtime".

Sub ReadAndSplit()

  Dim FileStream As TextStream
  Dim FileSysObj As FileSystemObject
  Dim Line As String
  Dim LinePart() As String
  Dim NumLines As Long
  Dim TimeStart As Double

  TimeStart = Timer

  Set FileSysObj = CreateObject("Scripting.FileSystemObject")
  NumLines = 0

  ' 1 means open read only
  Set FileStream = FileSysObj.OpenTextFile(ThisWorkbook.Path & "\Test4.txt", 1)

  Do While Not FileStream.AtEndOfStream
    Line = FileStream.ReadLine
    NumLines = NumLines + 1
    LinePart = Split(Line, "|")
  Loop

  FileStream.Close

  Debug.Print NumLines
  Debug.Print Timer - TimeStart

End Sub
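
The macro above only reads and splits each line. As a rough sketch of how the per-month count from the question might be layered on top of it, the variant below tallies rows in a Scripting.Dictionary keyed by year and month and writes the totals to a text file. Several things here are assumptions rather than details from the original post: the file names Accidents.csv and MonthCounts.txt, the presence of a header row, the "," delimiter, the DATE_COL position of the date field, and a dd/mm/yyyy date format. It also assumes no field contains a quoted comma. Like the macro above, it needs the "Microsoft Scripting Runtime" reference.

```vba
Sub CountRowsPerMonth()

  ' Assumptions (not from the original post): both file names, a header
  ' row, a "," delimiter, the zero-based index of the date column, and
  ' dates written as dd/mm/yyyy.
  Const IN_FILE As String = "Accidents.csv"
  Const OUT_FILE As String = "MonthCounts.txt"
  Const DATE_COL As Long = 0

  Dim FileSysObj As FileSystemObject
  Dim InStream As TextStream
  Dim OutStream As TextStream
  Dim Counts As Dictionary
  Dim LinePart() As String
  Dim DatePart() As String
  Dim MonthKey As String
  Dim Key As Variant

  Set FileSysObj = New FileSystemObject
  Set Counts = New Dictionary
  Set InStream = FileSysObj.OpenTextFile(ThisWorkbook.Path & "\" & IN_FILE, ForReading)

  ' Skip the assumed header row
  If Not InStream.AtEndOfStream Then InStream.SkipLine

  Do While Not InStream.AtEndOfStream
    LinePart = Split(InStream.ReadLine, ",")
    If UBound(LinePart) >= DATE_COL Then
      DatePart = Split(LinePart(DATE_COL), "/")
      If UBound(DatePart) = 2 Then
        ' Build a key such as "1979-01" from the year and month parts
        MonthKey = DatePart(2) & "-" & Right$("0" & DatePart(1), 2)
        ' Reading a missing key creates it as Empty, so Empty + 1 gives 1
        Counts(MonthKey) = Counts(MonthKey) + 1
      End If
    End If
  Loop

  InStream.Close

  ' Write one "yyyy-mm,count" line per month; True overwrites an old file
  Set OutStream = FileSysObj.CreateTextFile(ThisWorkbook.Path & "\" & OUT_FILE, True)
  For Each Key In Counts.Keys
    OutStream.WriteLine Key & "," & Counts(Key)
  Next Key
  OutStream.Close

End Sub
```

The dictionary returns its keys in the order they were first seen, so the output follows the order in which months appear in the file rather than calendar order. Because the keys take the form yyyy-mm, sorting the output file as plain text puts them in date order. Rows whose date field does not split into three parts are skipped silently.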


Source: https://stackoverflow.com/questions/28419493/how-can-i-count-and-log-the-number-of-rows-in-a-sheet-with-a-specific-month-year
