How to optimize VLOOKUP for a high search count? (alternatives to VLOOKUP)

名媛妹妹 2020-11-27 15:48

I am looking for alternatives to vlookup, with improved performance within the context of interest.

The context is the following:

  • I have a data set of
4 Answers
  • 2020-11-27 15:48

    Switch to Excel 2013 and use the Data Model. There you can define a column with unique ID keys in both tables and link the two tables with a relationship in a PivotTable. Then, if absolutely necessary, you can use GETPIVOTDATA() to fill the first table. I had a table of ~250K rows doing VLOOKUP against a similar ~250K-row table, and I stopped Excel calculating it after an hour. With the Data Model it took less than 10 seconds.

  • 2020-11-27 15:57

    A useful fix for the dictionary approach: check for a blank cell while building the dictionary. If the cell is blank, use Exit For to stop the loop.
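
    A minimal sketch of that check, assuming a Scripting.Dictionary (early or late bound) is passed in. FillDict is a hypothetical helper name, and the sketch assumes the keys are contiguous, so the first empty cell marks the end of the data:

    ```vba
    ' Hypothetical helper: fills dict with key -> value pairs from refRange,
    ' stopping at the first empty key cell.
    Sub FillDict(dict As Object, refRange As Range, dataCol As Long)
      Dim keyCell As Range

      For Each keyCell In refRange.Columns(1).Cells
        ' Assumption: keys are contiguous, so an empty cell means end of data
        If IsEmpty(keyCell.Value) Then Exit For
        ' Skip duplicate keys: dict.Add raises an error on a key that already exists
        If Not dict.Exists(keyCell.Value) Then
          dict.Add keyCell.Value, keyCell.Offset(0, dataCol - 1).Value
        End If
      Next keyCell
    End Sub
    ```

    The early exit mostly matters when refRange is larger than the data it holds, for example when a whole column is passed in.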

  • 2020-11-27 16:02

    I considered the following alternatives:

    • VLOOKUP array-formula
    • MATCH / INDEX
    • VBA (using a dictionary)

    The measured performance was:

    • VLOOKUP simple formula : ~10 minutes
    • VLOOKUP array-formula : ~10 minutes (1:1 performance index)
    • MATCH / INDEX : ~2 minutes (5:1 performance index)
    • VBA (using a dictionary) : ~6 seconds (100:1 performance index)

    Using the same reference sheet

    1) Lookup sheet: (vlookup array formula version)

             A          B
         1
         2   key51359    {=VLOOKUP(A2:A100001;sheet1!$A$2:$B$100001;2;FALSE)}
         3   key41232    formula in B2
         4   key10102    ... extends to
       ...   ...         ... 
     99999   key4153     ... cell B100001
    100000   key12818    ... (select whole range, and press
    100001   key35032    ... CTRL+SHIFT+ENTER to make it an array formula)
    100002
    

    2) Lookup sheet: (match+index version)

             A           B                                       C
          1
          2  key51359    =MATCH(A2;sheet1!$A$2:$A$100001;)       =INDEX(sheet1!$B$2:$B$100001;B2)
          3  key41232    =MATCH(A3;sheet1!$A$2:$A$100001;)       =INDEX(sheet1!$B$2:$B$100001;B3)
          4  key10102    =MATCH(A4;sheet1!$A$2:$A$100001;)       =INDEX(sheet1!$B$2:$B$100001;B4)
        ...  ...         ...                                     ...
      99999  key4153     =MATCH(A99999;sheet1!$A$2:$A$100001;)   =INDEX(sheet1!$B$2:$B$100001;B99999)
     100000  key12818    =MATCH(A100000;sheet1!$A$2:$A$100001;)  =INDEX(sheet1!$B$2:$B$100001;B100000)
     100001  key35032    =MATCH(A100001;sheet1!$A$2:$A$100001;)  =INDEX(sheet1!$B$2:$B$100001;B100001)
     100002
    

    3) Lookup sheet: (vbalookup version)

           A          B
         1
         2  key51359    {=vbalookup(A2:A50001;sheet1!$A$2:$B$100001;2)}
         3  key41232    formula in B2
         4  key10102    ... extends to
       ...  ...         ...
     50000  key91021    ... 
     50001  key42       ... cell B50001
     50002  key21873    {=vbalookup(A50002:A100001;sheet1!$A$2:$B$100001;2)}
     50003  key31415    formula in B50002 extends to
       ...  ...         ...
     99999  key4153     ... cell B100001
    100000  key12818    ... (select whole range, and press
    100001  key35032    ... CTRL+SHIFT+ENTER to make it an array formula)
    100002
    

    NB: For some reason (external or internal), vbalookup fails to return more than 65536 values at a time, so I had to split the array formula in two.

    The associated VBA code:

    Function vbalookup(lookupRange As Range, refRange As Range, dataCol As Long) As Variant
      Dim dict As New Scripting.Dictionary
      Dim myRow As Range
      Dim I As Long, J As Long
      Dim vResults() As Variant
    
      ' 1. Build a dictionary mapping each key in the first column of refRange
      '    to the value in column dataCol of the same row
      For Each myRow In refRange.Columns(1).Cells
        ' Keep the first occurrence of a key, as VLOOKUP does;
        ' dict.Add raises a run-time error on a duplicate key
        If Not dict.Exists(myRow.Value) Then
          dict.Add myRow.Value, myRow.Offset(0, dataCol - 1).Value
        End If
      Next myRow
    
      ' 2. Use it over all lookup data; cells whose key is not found stay Empty
      ReDim vResults(1 To lookupRange.Rows.Count, 1 To lookupRange.Columns.Count) As Variant
      For I = 1 To lookupRange.Rows.Count
        For J = 1 To lookupRange.Columns.Count
          If dict.Exists(lookupRange.Cells(I, J).Value) Then
            vResults(I, J) = dict(lookupRange.Cells(I, J).Value)
          End If
        Next J
      Next I
    
      vbalookup = vResults
    End Function
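
    A possible variant, not part of the timings above: reading each range into a Variant array once usually avoids the per-cell overhead of Cells(I, J).Value. This is only a sketch; vbalookupArr is a hypothetical name, and it assumes both ranges contain more than one cell and that refRange has at least dataCol columns.

    ```vba
    ' Sketch of an array-based variant of vbalookup (hypothetical name).
    ' Assumes lookupRange and refRange each contain more than one cell, so that
    ' Range.Value returns a 2-D Variant array rather than a single value.
    Function vbalookupArr(lookupRange As Range, refRange As Range, dataCol As Long) As Variant
      Dim dict As Object
      Dim refData As Variant, lookupData As Variant
      Dim vResults() As Variant
      Dim I As Long, J As Long

      ' Late binding, so no reference to Microsoft Scripting Runtime is needed
      Set dict = CreateObject("Scripting.Dictionary")

      ' Read both ranges into memory once instead of touching cells one by one
      refData = refRange.Value
      lookupData = lookupRange.Value

      ' 1. Build the dictionary from the in-memory copy of the reference table
      For I = 1 To UBound(refData, 1)
        If Not dict.Exists(refData(I, 1)) Then
          dict.Add refData(I, 1), refData(I, dataCol)
        End If
      Next I

      ' 2. Look up every key; keys that are not found stay Empty
      ReDim vResults(1 To UBound(lookupData, 1), 1 To UBound(lookupData, 2))
      For I = 1 To UBound(lookupData, 1)
        For J = 1 To UBound(lookupData, 2)
          If dict.Exists(lookupData(I, J)) Then vResults(I, J) = dict(lookupData(I, J))
        Next J
      Next I

      vbalookupArr = vResults
    End Function
    ```

    It would be entered as an array formula in the same way as vbalookup.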
    

    NB: Scripting.Dictionary requires a reference to Microsoft Scripting Runtime, which must be added manually (Tools -> References menu in the Excel VBA window).
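
    If adding the reference is not an option, the dictionary can instead be created with late binding. A minimal sketch (NewDictionary and DemoLateBoundDictionary are hypothetical names used only for illustration):

    ```vba
    ' Late-bound alternative: creates the dictionary at run time through its
    ' ProgID, so the VBA project needs no reference to Microsoft Scripting Runtime.
    Function NewDictionary() As Object
      Set NewDictionary = CreateObject("Scripting.Dictionary")
    End Function

    Sub DemoLateBoundDictionary()
      Dim dict As Object
      Set dict = NewDictionary()
      dict.Add "key42", "value42"
      Debug.Print dict("key42")   ' prints value42
    End Sub
    ```

    The trade-off is that a variable declared As Object gets no compile-time checking or auto-completion in the VBA editor.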

    Conclusion:

    In this context, VBA using a dictionary is 100x faster than VLOOKUP and 20x faster than MATCH/INDEX.

  • 2020-11-27 16:11

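    You may also want to consider the “double VLOOKUP” method (not my idea, seen elsewhere). It uses approximate-match VLOOKUP, which performs a binary search, so the key column on sheet 1 must be sorted in ascending order; the first VLOOKUP checks that the key found is an exact match before the second one returns the value. I tested it on 100,000 lookup values on sheet 2 (randomly sorted) against a data set identical to the one you’ve described on sheet 1, and timed it at just under 4 seconds. The code is also a bit simpler.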

    Sub FastestVlookup()
    
        With Sheet2.Range("B1:B100000")
            .FormulaR1C1 = _
            "=IF(VLOOKUP(RC1,Sheet1!R1C1:R100000C1,1)=RC1,VLOOKUP(RC1,Sheet1!R1C1:R100000C2,2),""N/A"")"
            .Value = .Value
        End With
    
    End Sub
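
    Because VLOOKUP without a fourth argument does an approximate match, it only gives correct results when the keys in Sheet1 column A are in ascending order. If that is not guaranteed, one option is to sort the reference table first. A sketch, assuming the table sits in Sheet1!A1:B100000 without a header row and that sorting it in place is acceptable (SortThenFastestVlookup is a hypothetical name):

    ```vba
    Sub SortThenFastestVlookup()

        ' Assumption: reference table in Sheet1!A1:B100000, no header row.
        ' Sorts the table in place by the key column, ascending.
        Sheet1.Range("A1:B100000").Sort _
            Key1:=Sheet1.Range("A1"), Order1:=xlAscending, Header:=xlNo

        ' Same double VLOOKUP as in FastestVlookup
        With Sheet2.Range("B1:B100000")
            .FormulaR1C1 = _
            "=IF(VLOOKUP(RC1,Sheet1!R1C1:R100000C1,1)=RC1,VLOOKUP(RC1,Sheet1!R1C1:R100000C2,2),""N/A"")"
            ' Replace the formulas with their calculated values
            .Value = .Value
        End With

    End Sub
    ```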
    