Question
There are two sets, A and B, each of size n. How can I find every element of A that is not in B (A-B) in O(n) time? What data structure should I use (a Bloom filter?)
Answer 1:
Given that both are sets, you should use a set / hash set. This lets you perform the contains / in operation in O(1) expected time. Bloom filters aren't a good fit for this type of problem: they can tell you that an element definitely isn't in a set, but a positive answer may be a false positive. You're better off using a regular hash set since you want an exact answer.
Given two hash sets, you can compute the set difference in O(min(|A|, |B|)) expected time, provided you are allowed to modify A in place.
If A is the smaller set, loop through all elements of A and discard the ones that are present in B.
If B is the smaller set, loop through all elements of B and remove from A each one that is also present in A.
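The two cases above can be sketched in Java roughly as follows. This is only a minimal illustration, assuming both inputs are hash-based sets with O(1) expected lookups and that overwriting A is acceptable; the class name InPlaceDifference and the method name subtractInPlace are made up for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class InPlaceDifference {
    // Hypothetical helper: removes from a every element that is also in b,
    // so that a ends up holding A - B. The set b is never modified.
    static <T> void subtractInPlace(Set<T> a, Set<T> b) {
        if (a.size() <= b.size()) {
            // A is the smaller set: scan A and drop elements found in B.
            a.removeIf(b::contains);
        } else {
            // B is the smaller set: scan B and remove each element from A.
            for (T element : b) {
                a.remove(element);
            }
        }
    }

    public static void main(String[] args) {
        Set<String> a = new HashSet<>(Arrays.asList("X", "Y", "Z"));
        Set<String> b = new HashSet<>(Arrays.asList("X", "Y"));
        subtractInPlace(a, b);
        System.out.println(a); // prints [Z]
    }
}
```

Either branch only iterates over the smaller of the two sets, which is where the O(min(|A|, |B|)) bound comes from. When both sets have size n, as in the question, this is still O(n).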
Answer 2:
Here is one way to calculate the set difference in O(n) time complexity (and O(n) space complexity) without using any fancy data structure other than a set. I assume that the sets A and B can test for membership in O(1) time (as is typical of most HashSet implementations). This algorithm also does not require sets A and B to be modified.
Algorithm Pseudocode
Goal: Calculate (A-B)
Input: Set A, Set B;
BEGIN:
    Create Empty Set C to contain (A-B).
    for each element a in Set A:
        if a does not exist in Set B:
            Add a to Set C;
    Return Set C;
END;
Time Complexity:
This runs in O(n) time complexity because you only have to iterate through all n elements of Set A once. And for each of the n elements, you test Set B for membership in O(1) time. That yields an O(n) runtime for the algorithm.
Space Complexity:
The space complexity is O(n) because the new set C may hold up to all n elements of A (this happens when A and B have no elements in common).
Java Sample Implementation
import java.util.HashSet;

public class Tester {

    // Returns a new set containing every element of A that is not in B.
    // Neither A nor B is modified.
    public static HashSet<String> setDifference(HashSet<String> A, HashSet<String> B) {
        HashSet<String> C = new HashSet<String>();
        for (String element : A) {
            // Membership test on a HashSet takes O(1) expected time.
            if (!B.contains(element)) {
                C.add(element);
            }
        }
        return C;
    }

    public static void main(String[] args) {
        HashSet<String> A = new HashSet<String>();
        HashSet<String> B = new HashSet<String>();
        A.add("X");
        A.add("Y");
        A.add("Z");
        B.add("X");
        B.add("Y");
        HashSet<String> C = setDifference(A, B);
        // Set should only contain the element "Z"
        System.out.println(C);
    }
}
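As a possible alternative to the hand-written loop, the same non-destructive difference can be built from the standard collection methods by copying A and then calling removeAll. This is just a sketch under the same hash set assumption; the class name GenericDifference and the method name difference are invented for this example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class GenericDifference {
    // Hypothetical helper: returns a new set holding A - B.
    // Neither input set is modified.
    static <T> Set<T> difference(Set<T> a, Set<T> b) {
        Set<T> result = new HashSet<>(a); // copy of A, O(|A|) time and space
        result.removeAll(b);              // drop every element that is also in B
        return result;
    }

    public static void main(String[] args) {
        Set<String> a = new HashSet<>(Arrays.asList("X", "Y", "Z"));
        Set<String> b = new HashSet<>(Arrays.asList("X", "Y"));
        System.out.println(difference(a, b)); // prints [Z]
        System.out.println(a.size());         // prints 3, A is unchanged
    }
}
```

The generic type parameter means it works for any element type with consistent equals and hashCode, not only String. The copy still costs O(n) time and space, so the overall bounds match the loop version.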
Source: https://stackoverflow.com/questions/54641552/an-algorithm-to-find-the-difference-of-two-set-a-and-b-with-size-n