BinarySearch for all occurrences?

空扰寡人 submitted on 2019-12-11 11:15:12

Question


How can I search for all occurrences of a value in an array using BinarySearch? The default TArray.BinarySearch in System.Generics.Collections only returns one index.

Example Array:

A = [1, 2, 3, 3, 3, 6, 7, 8, 9];


Answer 1:


A binary search assumes your array is already sorted, so any other matching elements will be clustered around the matching element returned by BinarySearch. The Delphi XE5 help notes that

"If there is more than one element in the array equal to Item, the index of the first match is returned in FoundIndex. This is the index of any of the matching items, not necessarily of the first item."

This suggests that you'll need to run a search both forward and backward in the array to get all matching elements.




Answer 2:


Let me explain the problem a bit more. Whether to continue with a sequential search or with further binary searches once you have found an index depends on the data you expect to find. The size of the array (10000 elements, say) is not relevant; how many duplicates there are of the item you are searching for is. For example, in a list of 10000 elements consisting only of the values 1, 2, 3, 4 and 5, there could be thousands of each value, and a series of subsequent binary searches would be preferable. If the values could range from 1 to 1000000, duplicates are far less likely, and a binary search followed by a sequential search in both directions is the best approach.

For the binary and then sequential approach, the algorithm to find the start and end index would be the following:

  1. Find the index using a binary search.
  2. Search left to find the first index using a sequential search.
  3. Search right to find the last index using a sequential search.
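
The three steps above can be sketched as follows. This is a minimal sketch, assuming a sorted, zero-based TArray<Integer>; FindRangeLinear is a hypothetical name, not part of the RTL. It does not depend on which of the matching indexes TArray.BinarySearch happens to return.

```pascal
program LinearExpandDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Generics.Collections;

// Hypothetical helper: returns True and the first/last index of aValue
// in the sorted array aValues, or False if aValue is absent.
function FindRangeLinear(const aValues: TArray<Integer>; const aValue: Integer;
  out aFirst, aLast: Integer): Boolean;
var
  found: Integer;
begin
  aFirst := -1;
  aLast := -1;
  // Step 1: any matching index will do as a starting point.
  Result := TArray.BinarySearch<Integer>(aValues, aValue, found);
  if not Result then
    Exit;
  // Step 2: walk left while the previous element still matches.
  aFirst := found;
  while (aFirst > 0) and (aValues[aFirst - 1] = aValue) do
    Dec(aFirst);
  // Step 3: walk right while the next element still matches.
  aLast := found;
  while (aLast < High(aValues)) and (aValues[aLast + 1] = aValue) do
    Inc(aLast);
end;

var
  A: TArray<Integer>;
  first, last: Integer;
begin
  A := TArray<Integer>.Create(1, 2, 3, 3, 3, 6, 7, 8, 9);
  if FindRangeLinear(A, 3, first, last) then
    Writeln(Format('3 occupies indexes %d..%d', [first, last]));
end.
```

The two loops cost one comparison per duplicate, so this is a good fit when only a handful of equal values are expected.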

If you wanted to use only binary searches, then you would need to switch your approach and repeat the search on ever smaller ranges until you find the start and the finish.

  1. Find an index using a binary search.
  2. Binary search the range from the start of the array to index-1 for the value.
  3. If you find the value, search again between the start of the array and newindex-1.
  4. Repeat this search until you no longer find the value; the last hit is the first occurrence.
  5. Binary search (index+1)..end for the value.
  6. If you find the value, search again between newindex+1 and end.
  7. Repeat this search until you no longer find the value; the last hit is the last occurrence.

A code example would look a bit like this. It assumes that BinarySearch is a binary search that exits as soon as it finds a match.

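// Assumptions: BinarySearch is an early-exit search that takes the same
// parameters as TArray.BinarySearch (values, item, found index, comparer,
// start index, count), and TSearchIntegers is a zero-based array of Integer.
function GetIndexes(const aSearch: TSearchIntegers; const aValue: Integer; var aStartIndex, aEndIndex: Integer): Boolean;
var
  foundIndex: Integer;
  lookFor: Integer;
begin
  if BinarySearch(aSearch, aValue, foundIndex) then
  begin
    Result := True;
    // Keep searching to the left of the last hit: indexes 0..aStartIndex - 1,
    // which is aStartIndex elements starting at index 0.
    lookFor := foundIndex;
    repeat
      aStartIndex := lookFor;
    until not BinarySearch(aSearch, aValue, lookFor, TComparer<Integer>.Default, 0, aStartIndex);
    // Keep searching to the right of the last hit: indexes
    // aEndIndex + 1..High(aSearch), which is High(aSearch) - aEndIndex elements.
    lookFor := foundIndex;
    repeat
      aEndIndex := lookFor;
    until not BinarySearch(aSearch, aValue, lookFor, TComparer<Integer>.Default, aEndIndex + 1, High(aSearch) - aEndIndex);
  end
  else
    Result := False;
end;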

Ultimately, your data (which we don't have) will determine the best course of action for you.

Now to complicate things a bit. The variation of the binary search that Delphi is using in TArray.BinarySearch is one that doesn't end early when a match is found. It will always find the index of the first item as it doesn't exit the loop when it finds a match.

Result := False;
L := Index;
H := Index + Count - 1;
while L <= H do
begin
  mid := L + (H - L) shr 1;
  cmp := Comparer.Compare(Values[mid], Item);
  if cmp < 0 then
    L := mid + 1
  else
  begin
    H := mid - 1;
    if cmp = 0 then
      Result := True;  // <-- It doesn't end here
  end;
end;

That means that you have a bit of a penalty when you have a lot of identical values but it does have a nice side effect. You can do something like this to find what you are looking for:

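function GetIndexes(const aSearch: TSearchIntegers; const aValue: Integer; var aStartIndex, aEndIndex: Integer): Boolean;
begin
  Result := False;
  // The search does not stop at the first hit, so this yields the first occurrence.
  if TArray.BinarySearch<Integer>(aSearch, aValue, aStartIndex) then
  begin
    // Searching for aValue + 1 yields the index just past the last aValue,
    // whether or not aValue + 1 is present. Assumes aValue < High(Integer).
    TArray.BinarySearch<Integer>(aSearch, aValue + 1, aEndIndex);
    // That index can be Length(aSearch) when aValue is the largest element,
    // so check the bounds before reading aSearch[aEndIndex].
    if (aEndIndex > High(aSearch)) or (aSearch[aEndIndex] <> aValue) then
      Dec(aEndIndex);
    Result := True;
  end;
end;

This works because the search also returns the index of the next larger value even if it doesn't find aValue + 1 in the array. The if statement at the end handles the case where our value is also the last value of the array: with the implementation shown above, the returned index is then one past the end of the array, so it is checked against High(aSearch) before the array is read.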

This is dependent on the code for TArray.BinarySearch remaining as it is.
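
If that dependency is a concern, the same idea can be written without TArray.BinarySearch at all. The following is a minimal sketch, assuming a sorted array of Integer; LowerBound, UpperBound and GetIndexRange are hypothetical names, not RTL routines. Comparing with <= in UpperBound also avoids computing aValue + 1, which would overflow for High(Integer).

```pascal
program BoundsDemo;

{$APPTYPE CONSOLE}

// Hypothetical helper: index of the first element >= aValue,
// or Length(aValues) if every element is smaller.
function LowerBound(const aValues: array of Integer; aValue: Integer): Integer;
var
  lo, hi, mid: Integer;
begin
  lo := 0;
  hi := Length(aValues); // search the half-open range [lo, hi)
  while lo < hi do
  begin
    mid := lo + (hi - lo) div 2;
    if aValues[mid] < aValue then
      lo := mid + 1
    else
      hi := mid;
  end;
  Result := lo;
end;

// Hypothetical helper: index of the first element > aValue,
// or Length(aValues) if no element is larger.
function UpperBound(const aValues: array of Integer; aValue: Integer): Integer;
var
  lo, hi, mid: Integer;
begin
  lo := 0;
  hi := Length(aValues);
  while lo < hi do
  begin
    mid := lo + (hi - lo) div 2;
    if aValues[mid] <= aValue then
      lo := mid + 1
    else
      hi := mid;
  end;
  Result := lo;
end;

// Returns True and the first/last index of aValue, or False if it is absent.
function GetIndexRange(const aValues: array of Integer; aValue: Integer;
  out aStartIndex, aEndIndex: Integer): Boolean;
begin
  aStartIndex := LowerBound(aValues, aValue);
  aEndIndex := UpperBound(aValues, aValue) - 1;
  Result := aStartIndex <= aEndIndex;
end;

var
  first, last: Integer;
begin
  if GetIndexRange([1, 2, 3, 3, 3, 6, 7, 8, 9], 3, first, last) then
    Writeln('3 occupies indexes ', first, '..', last);
end.
```

Both bounds take a logarithmic number of comparisons regardless of how many duplicates there are, so this behaves well even when the array holds thousands of equal values.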



Source: https://stackoverflow.com/questions/24584992/binarysearch-for-all-occurrences
