I would like to fill an array in VBA with the row numbers of only rows which meet a certain criteria. I would like the fastest method possible (for example, something like <
Still around 2-3 times the time of the efficient variant array from Chris, but the technique is powerful and has application beyond this question
One point to note is that Application.Transpose
is limited to 65536 cells, so a longer range needs to be "chunked" into pieces.
Sub GetEm()
Dim x
x = Filter(Application.Transpose(Application.Evaluate("=IF(A1:A50000=""aa"",ROW(A1:a50000),""x"")")), "x", False)
End Sub
First copy the range to a variant array , then loop over the array
Arr = rngval
For I = 1 to ubound(arr)
If arr(I,1) = valMatch Then RowArray(x) = I: x = x + 1
Next
Building off what others have offered here, I've combined both methods along with some string manipulation to get the exact row numbers of any given range containing the desired match without looping.
The only note that differs from your code is that RowArray()
is a String
type. However, you could convert it to Long using CLng
as you strip numbers out as needed, if you need to do that.
Sub get_row_numbers()
Dim rowArray() As String, valRange As Range, valMatch As String
Dim wks As Worksheet, I As Long, strAddress As String
Set wks = Sheets(1)
valMatch = "aa"
With wks
Set valRange = .Range("A1:A11")
Dim strCol As String
strCol = Split(valRange.Address, "$")(1)
'-> capture the column name of the evaluated range
'-> NB -> the method below will fail if a multi column range is selected
With valRange
If Not .Find(valMatch) Is Nothing Then
'-> make sure valMatch exists, otherwise SpecialCells method will fail
.AutoFilter 1, valMatch
Set valRange = .SpecialCells(xlCellTypeVisible)
'-> choose only cells where ValMatch is found
strAddress = valRange.Address '-> capture address of found cells
strAddress = Replace(Replace(strAddress, ":", ""), ",", "") '-> remove any commas and colons
strAddress = Replace(strAddress, "$" & strCol & "$", ",") '-> replace $column$ with comma
strAddress = Right(strAddress, Len(strAddress) - 1) '-> remove leading comma
rowArray() = Split(strAddress, ",")
'-> test print
For I = 0 To UBound(rowArray())
Debug.Print rowArray(I)
Next
End If 'If Not .Find(valMatch) Is Nothing Then
End With ' With valRange
End With 'With wks
End Sub
You have your range hard-coded in the example. Do you have a spare column to the right? If so, you could fill the cells to the right with 0 if it's not a match, or the row number if it is. Then pull that into an array and filter it. No loops:
Sub NoLoop()
Dim valMatch As String
Dim rData As Excel.Range, rFormula As Excel.Range
Dim a As Variant, z As Variant
Set rData = ThisWorkbook.Worksheets(1).Range("A1:A11") 'hard-coded in original example
Set rFormula = ThisWorkbook.Worksheets(1).Range("B1:B11") ' I'm assuming this range is currently empty
valMatch = "aa" 'hard-coded in original example
'if it's a valid match, the cell will state its row number, otherwise 0
rFormula.FormulaR1C1 = "=IF(RC[-1]=""" & valMatch & """,ROW(RC),0)"
a = Application.Transpose(rFormula.Value)
z = Filter(a, 0, False) 'filters out the zeroes, you're left with an array of valid row numbers
End Sub
I have to credit Jon49 at One-dimensional array from Excel Range for the Application.Transpose trick to get a 1-d array.
There is an assumption in the question title: that a looping solution is slow and a non-looping solution is faster. So, I conducted some comparisons to check that.
Test Case
I created some sample data consisting of 50,000 samples, and 50% matching values. For the fastest methods I created two more sample sets, again with 50,000 rows and one with 10% matching rows, another with 90% matching row.
I ran each of the posted methods over this data in a loop, repeating the logic 10 times (so times are for processing a total of 500,000 rows).
50% 10% 90%
ExactaBox 1300 1240 1350 ms
Scott Holtzman 415000
John Bustos 12500
Chris neilsen 310 310 310
Brettdj 970 970 970
OP 1530 1320 1700
So the moral is clear: just because it includes a loop, doesn't make it slow. What is slow is access the worksheet, so you should make every effort to minimise that.
Update Added test of Brettdj's comment: single line of code
For completeness sake, here's my solution
Sub GetRows()
Dim valMatch As String
Dim rData As Range
Dim a() As Long, z As Variant
Dim x As Long, i As Long
Dim sCompare As String
Set rData = Range("A1:A50000")
z = rData
ReDim a(1 To UBound(z, 1))
x = 1
sCompare = "aa"
For i = 1 To UBound(z)
If z(i, 1) = sCompare Then a(x) = i: x = x + 1
Next
ReDim Preserve a(1 To x - 1)
End Sub
Everyone, thanks for your individual inputs.
ExactaBox, your solution has been much helpful to me. However, there is a catch in returning 0 value through formula
rFormula.FormulaR1C1= "=IF(RC[-1]=""" & valMatch & """,ROW(RC),0)".
Since VBA Filter function filters out values by making string comparisons, it would also filter out row numbers having zeroes in them. For example valid row numbers, 20, 30, 40 etc. shall also be filtered out because they contain zeroes, so it would be better to write a string in place of 0 in the formula, which could therefore be:
rFormula.FormulaR1C1= "=IF(RC[-1]=""" & valMatch & """,ROW(RC),""Valid"")"
as was also suggested by brettdj above, who used "x" string as the last argument.