问题
I've written a short function for array intersection and wanted to know why one function is faster than the other.
1)
Dim list2() As String 'Assume it has values'
Dim list2length As Integer = list2.length
Function newintersect(ByRef list1() As String) As String()
Dim intersection As New ArrayList
If (list1.Length < list2length) Then
'use list2'
For Each thing As String In list2
If (Array.IndexOf(list1, thing) <> -1) Then
intersection.Add(thing)
End If
Next
Else
'use list1'
For Each thing As String In list1
If (Array.IndexOf(list2, thing) <> -1) Then
intersection.Add(thing)
End If
Next
End If
Return intersection
End Function
2)
Dim list2() As String 'Assume it has values'
Dim list2length As Integer = list2.length
Function newintersect(ByRef list1() As String) As String()
Dim intersection As New ArrayList
If (list1.Length > list2length) Then 'changed >'
'use list2'
For Each thing As String In list2
If (Array.IndexOf(list1, thing) <> -1) Then
intersection.Add(thing)
End If
Next
Else
'use list1'
For Each thing As String In list1
If (Array.IndexOf(list2, thing) <> -1) Then
intersection.Add(thing)
End If
Next
End If
Return intersection
End Function
3)
Dim list2() As String 'Assume it has values'
Dim list2length As Integer = list2.length
Function newintersect(ByRef list1() As String) As String()
For Each thing As String In list1
If (Array.IndexOf(list2, thing) <> -1) Then
intersection.Add(thing)
End If
Next
Return intersection
End Function
So for my testcase, 1 take 65 seconds, 3 takes 63 seconds, while 2 actually takes 75 seconds. Anyone know why 3 is the fastest? And why is 1 faster than 2?
(Sorry about the poor formatting...can't seem to paste it right)
回答1:
That's not much of a difference. Also, it doesn't seem like the methods would produce the same result, so it would be pointless to compare the performance, right?
Anyhow, the Array.IndexOf
is not very efficient way to find items, and doesn't scale well. You should get a dramatic improvement if you use a hash key based collection as lookup instead, something like this:
Function newintersect(ByRef list1 As String(), ByRef list2 As String()) As String()
Dim smaller As HashSet(Of String)
Dim larger As String()
If list1.Length < list2.Length Then
smaller = New HashSet(Of String)(list1)
larger = list2
Else
smaller = New HashSet(Of String)(list2)
larger = list1
End If
Dim intersection As New List(Of String)
For Each item As String In larger
If smaller.Contains(item) Then
intersection.Add(item)
End If
Next
Return intersection.ToArray()
End Function
回答2:
I'd expect you'd find that with different test cases you can reverse the results you had above and reach a situation where 2 is fastest and 1 & 3 are slower.
It's difficult to comment without knowing the make-up of the test case, it'll depend on the location of 'intersecting' items within the two arrays - if these tend to be close the to front on one array and closer to the end of another then the nesting order of the array iteration / IndexOf will have markedly different performance.
BTW - there are better ways of performing an intersection - sorting one or other array and performing a BinarySearch is one means - using a Dictionary(Of String, ...) or similar is another - and either would result in far better performance.
回答3:
This is from the MSDN documentation
Dim id1() As Integer = {44, 26, 92, 30, 71, 38}
Dim id2() As Integer = {39, 59, 83, 47, 26, 4, 30}
' Find the set intersection of the two arrays.
Dim intersection As IEnumerable(Of Integer) = id1.Intersect(id2)
For Each id As Integer In intersection
Debug.WriteLine(id.ToString)
Next
来源:https://stackoverflow.com/questions/3672328/array-intersection-function-speed