How to traverse files (conditionally) faster than using FileSystemObject

后端 未结 4 2001
走了就别回头了
走了就别回头了 2021-01-07 11:50

I\'ve written some VBA code using file objects to go into a folder, search for particular files (CSV) that meet certain criteria (contain \"HR\" in filename and created with

相关标签:
4条回答
  • 2021-01-07 12:22

    This ought to show some improvement, considering the ratio of HR files to total files (250 / 30,000).

    Using Dir Function, Minimize reliance on FileSystemObject

    The idea here is to use the Dir function first to get a list of all file names that contain the "HR" substring, and only use the FileSystemObject against those files to get the timestamp information -- there's no use incurring the overhead of FSO on every file in that directory.

    Then, we process only those files which match the "HR" criteria:

    Sub usingDir()
    Dim folderPath As String
    Dim fileName As String
    Dim filesToProcess As New Collection
    Dim item As Variant
    Dim fileDate As Date
    Dim firstDate As Date
    Dim secondDate As Date
    
    'Defining the user-input variables
    firstDate = Cells(2, "E").Value
    secondDate = Cells(3, "E").Value
    folderPath = "\\SRV-1\process\DUMP\"
    
    ' Gets a collection of files matching the "HR" criteria
    fileName = Dir(folderPath)
    Do While Not fileName = ""
        If InStr(fileName, "HR") > 0 Then
            'Only processing files with "HR"
            filesToProcess.Add (folderPath & fileName)
        End If
        fileName = Dir
    Loop
    
    'Now we deal only with the "HR" files:
    With CreateObject("Scripting.FileSystemObject")
        For Each item In filesToProcess
            ' Check the date last modified
            fileDate = .GetFile(item).DateLastModified ' modify as needed
            If firstDate < fileDate And secondDate > fileDate Then
                '
                '
                Debug.Print item
                'your code to Do Stuff goes here
                '
                '
                '
            End If
        Next
    End With
    End Sub
    

    UPDATE: Without Using the FileSystemObject

    This was nagging at me, and I figured there must be a way to get the timestamp information without relying on FileSystemObject. There is. We'll still use Dir to traverse the files, but now we'll eliminate any reference to FileSystemObject and replace with some fancy WinAPI function calls. Check out Chip Pearson's article here and download the .bas modules. You'll need the following two files imported to your VBProject:

    • modGetSetFileTimes
    • modTimeConversionFunctions

    And then you can do something like this:

    Option Explicit
    Sub withoutFSO()
    Dim folderPath As String
    Dim FileName As String
    Dim filesToProcess As New Collection
    Dim item As Variant
    Dim fileDate As Date
    Dim firstDate As Date
    Dim secondDate As Date
    
    'Defining the user-input variables
    firstDate = Cells(2, "E").Value
    secondDate = Cells(3, "E").Value
    folderPath = "\\Your\Path"
    
    ' Gets a collection of files matching the "HR" criteria and our Date range
    FileName = Dir(folderPath)
    Do While Not FileName = ""
        'Only processing files with "HR"
        If InStr(FileName, "HR") > 0 Then
            ' Only process files that meet our date criteria
            fileDate = CDate(modGetSetFileTimes.GetFileDateTime(CStr(item), FileDateLastModified))
            If firstDate < fileDate And secondDate > fileDate Then
                filesToProcess.Add (folderPath & FileName)
            End If
        End If
        FileName = Dir
    Loop
    
    'Now we deal only with the matching files:
    For Each item In filesToProcess
        Debug.Print item
        Debug.Print fileDate
        'your code to Do Stuff goes here
        '
        '
        '
    Next
    End Sub
    

    This should be an improvement even over my original answer, and, if combined with a more efficient manner of retrieving data (i.e., using ADO instead of Workbooks.Open, if possible) then you should be very optimized.

    0 讨论(0)
  • 2021-01-07 12:39

    Take a look at Power Query -- it's a Microsoft add-in for Excel versions 2012 & 2013, and built-in to 2016. Setting up PQ to do this will be amazingly fast, and the 'script' is reusable! No VBA needed.

    You can search and combine the multiple files on the specified criteria, but then merge or append to the new/master file, too. For efficiency, rather than processing each file individually, might I suggest gathering up all the data files (by your criteria), combining them to one table, then use the new table to merge/append to the new/master

    Hope this helps...

    0 讨论(0)
  • 2021-01-07 12:39

    It took a long time because for each interation you pass the information to the main worksheet.

    In this case is better use a multidimensional array to keep the information and in the end of the process you pass the the array info in the main worksheet.

    I dont know what information you get in each worksheet, soo i cant create an didatical example for you.

    0 讨论(0)
  • 2021-01-07 12:42

    In addition to using the Dir function instead of FileSystemObject, if you cannot automate PowerQuery, and all you need is the data and not the formatting, consider making a direct data connection to the source workbooks using ADODB.

    Add a reference to Microsoft ActiveX Data Objects 6.1 Library (via Tools -> References...). There may be versions other than 6.1; choose the highest.

    Then you can use something like the following code:

    Dim fso As New Scripting.FileSystemObject
    Dim filepath As Variant
    For Each filepath In filesToProcess
        ' Check the date last modified
        fileDate = fso.GetFile(item).DateLastModified ' modify as needed
        If firstDate < fileDate And secondDate > fileDate Then
    
            Dim connectionString As String
            connectionString = _
                "Provider=Microsoft.ACE.OLEDB.12.0;" & _
                "Data Source=""" & filepath & """;" & _
                "Extended Properties=""Excel 12.0;HDR=No"""
    
            Dim worksheetName As String
            worksheetName = "Sheet1"
            ' There can be multiple worksheets per workbook.
            ' If you are only interested in one worksheet per workbook, then fill in worksheetName somehow
            ' Otherwise, you will probably need an inner loop to iterate over all the worksheets
    
            Dim sql As String
            sql = _
                "SELECT * " & _
                "FROM [" & worksheetName & "$]"
    
            Dim rs As New ADODB.Recordset
            rs.Open sql, connectionString
    
            destinationWorksheet.Range("A1").CopyFromRecordset rs
    
            rs.Close
            Set rs = Nothing
        End If
    Next
    
    0 讨论(0)
提交回复
热议问题