Array intersection function speed

ぃ、小莉子 提交于 2020-02-05 05:57:05

问题


I've written a short function for array intersection and wanted to know why one function is faster than the other.

1)

Dim list2() As String 'Assume it has values'
Dim list2length As Integer = list2.length

Function newintersect(ByRef list1() As String) As String()
    Dim intersection As New ArrayList
    If (list1.Length < list2length) Then
        'use list2'
        For Each thing As String In list2
            If (Array.IndexOf(list1, thing) <> -1) Then
                intersection.Add(thing)
            End If
        Next
    Else
        'use list1'
        For Each thing As String In list1
            If (Array.IndexOf(list2, thing) <> -1) Then
                intersection.Add(thing)
            End If
        Next
    End If
    Return intersection
End Function

2)

Dim list2() As String 'Assume it has values'
Dim list2length As Integer = list2.length

Function newintersect(ByRef list1() As String) As String()
    Dim intersection As New ArrayList
    If (list1.Length > list2length) Then 'changed >'
        'use list2'
        For Each thing As String In list2
            If (Array.IndexOf(list1, thing) <> -1) Then
                intersection.Add(thing)
            End If
        Next
    Else
        'use list1'
        For Each thing As String In list1
            If (Array.IndexOf(list2, thing) <> -1) Then
                intersection.Add(thing)
            End If
        Next
    End If
    Return intersection
End Function

3)

Dim list2() As String 'Assume it has values'
Dim list2length As Integer = list2.length

Function newintersect(ByRef list1() As String) As String()
    For Each thing As String In list1
        If (Array.IndexOf(list2, thing) <> -1) Then
            intersection.Add(thing)
        End If
    Next
    Return intersection
End Function

So for my testcase, 1 take 65 seconds, 3 takes 63 seconds, while 2 actually takes 75 seconds. Anyone know why 3 is the fastest? And why is 1 faster than 2?

(Sorry about the poor formatting...can't seem to paste it right)


回答1:


That's not much of a difference. Also, it doesn't seem like the methods would produce the same result, so it would be pointless to compare the performance, right?

Anyhow, the Array.IndexOf is not very efficient way to find items, and doesn't scale well. You should get a dramatic improvement if you use a hash key based collection as lookup instead, something like this:

Function newintersect(ByRef list1 As String(), ByRef list2 As String()) As String()
  Dim smaller As HashSet(Of String)
  Dim larger As String()
  If list1.Length < list2.Length Then
    smaller = New HashSet(Of String)(list1)
    larger = list2
  Else
    smaller = New HashSet(Of String)(list2)
    larger = list1
  End If
  Dim intersection As New List(Of String)
  For Each item As String In larger
    If smaller.Contains(item) Then
      intersection.Add(item)
    End If
  Next
  Return intersection.ToArray()
End Function



回答2:


I'd expect you'd find that with different test cases you can reverse the results you had above and reach a situation where 2 is fastest and 1 & 3 are slower.

It's difficult to comment without knowing the make-up of the test case, it'll depend on the location of 'intersecting' items within the two arrays - if these tend to be close the to front on one array and closer to the end of another then the nesting order of the array iteration / IndexOf will have markedly different performance.

BTW - there are better ways of performing an intersection - sorting one or other array and performing a BinarySearch is one means - using a Dictionary(Of String, ...) or similar is another - and either would result in far better performance.




回答3:


This is from the MSDN documentation

    Dim id1() As Integer = {44, 26, 92, 30, 71, 38}
    Dim id2() As Integer = {39, 59, 83, 47, 26, 4, 30}

    ' Find the set intersection of the two arrays.
    Dim intersection As IEnumerable(Of Integer) = id1.Intersect(id2)

    For Each id As Integer In intersection
        Debug.WriteLine(id.ToString)
    Next


来源:https://stackoverflow.com/questions/3672328/array-intersection-function-speed

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!