This is a problem that could be done with some type of brute-force algorithm, but I was wondering if there are some efficient ways of doing this.
Let's assume that
Edit: This won't work as user98235 points out in the comments. I will leave this here however in case anyone else has the same idea.
I think this will work if I have understood the problem correctly.
Pseudocode:
resultList = new List of Pair
elementSet = new Set of int
for pair in inputPairs:
    if not elementSet.Contains(pair.First) and not elementSet.Contains(pair.Second):
        elementSet.Add(pair.First)
        elementSet.Add(pair.Second)
        resultList.Add(pair)
The time complexity will be O(N).
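To make the failure concrete, here is a minimal Python sketch of the same greedy pass. The function name `greedy_pairs` and the sample inputs are illustrative assumptions, not part of the original question.

```python
def greedy_pairs(pairs):
    """Keep each pair whose two numbers are both still unused (one O(N) pass)."""
    used = set()      # numbers already taken by a kept pair
    result = []
    for a, b in pairs:
        if a not in used and b not in used:
            used.update((a, b))
            result.append((a, b))
    return result
```

On the hypothetical input `[(2, 3), (1, 2), (3, 4)]` this keeps only `(2, 3)`, while the reordered input `[(1, 2), (2, 3), (3, 4)]` keeps `(1, 2)` and `(3, 4)`. The result depends on input order, so the greedy pass does not always find the maximum.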
Edit: I misunderstood the problem, so the previous approach is not correct.
Here is my second attempt:
If we view each number in a pair as a node in a graph, with an edge between two nodes whenever some pair contains both of them, the problem reduces to finding a maximum matching in that graph.
When the graph is bipartite, this is maximum bipartite matching, which can be solved using the classic Ford-Fulkerson algorithm. Note that the graph built from arbitrary pairs is not guaranteed to be bipartite.
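A rough Python sketch of the bipartite case follows. On unit capacities, Ford-Fulkerson amounts to repeatedly searching for augmenting paths, so the sketch does that directly instead of building a flow network. The function name `bipartite_matching_size` is an assumption for illustration. It returns `None` when the graph has an odd cycle, since the reduction does not apply then.

```python
from collections import defaultdict, deque

def bipartite_matching_size(pairs):
    """Return the maximum matching size, or None if the graph is not bipartite."""
    adj = defaultdict(set)
    for a, b in pairs:
        if a == b:
            return None              # a self-loop is an odd cycle
        adj[a].add(b)
        adj[b].add(a)

    # 2-colour every component with BFS; a colour clash means an odd cycle.
    side = {}
    for start in adj:
        if start in side:
            continue
        side[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in side:
                    side[v] = 1 - side[u]
                    queue.append(v)
                elif side[v] == side[u]:
                    return None

    match = {}  # side-1 node -> side-0 node currently matched to it

    def augment(u, seen):
        # Try to match side-0 node u, re-routing earlier matches if needed.
        for v in adj[u]:
            if v in seen:
                continue
            seen.add(v)
            if v not in match or augment(match[v], seen):
                match[v] = u
                return True
        return False

    return sum(augment(u, set()) for u in adj if side[u] == 0)
```

For example, the path `[(1, 2), (2, 3), (3, 4)]` gives 2, while the triangle `[(1, 2), (2, 3), (1, 3)]` gives `None` because it is not bipartite.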
First wrong approach:
We can solve this by using dynamic programming.
Sort the pairs by their starting point (breaking ties by their ending point).
Assume that we have a function f, where f(i) returns the maximum number of pairs we can select if we choose from pair i onward.
If we select pair i, we need to find the smallest index greater than i whose pair does not overlap with pair i.
We have
f(i) = max(1 + f(next index not overlapping i), f(i + 1))
Storing the result of f(i) in a table, we get a solution with O(n^2) time complexity, and the result will be f(0).
Pseudo code:
sort data; // Assume we have a data array that stores all pairs
int[] dp;  // one entry per pair, filled with -1 (not yet computed)
int f(int index) {
    if (index == data.length)
        return 0;
    if (dp[index] != -1) // we have calculated this before
        return dp[index];
    // find the first later pair that starts after this one ends
    int nxt = -1;
    for (int i = index + 1; i < data.length; i++) {
        if (data[i].start > data[index].end) {
            nxt = i;
            break;
        }
    }
    if (nxt == -1)
        return dp[index] = Math.max(1, f(index + 1));
    return dp[index] = Math.max(1 + f(nxt), f(index + 1));
}
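The same recurrence can be sketched bottom-up in Python. This is an illustrative version, not the code above: the name `max_non_overlapping` is made up, each pair is assumed to be an interval and is normalised so that start <= end, and the linear scan for the next index is swapped for a binary search over the sorted start points.

```python
from bisect import bisect_right

def max_non_overlapping(pairs):
    """Interval-style DP: f(i) = max(1 + f(next), f(i + 1))."""
    # Assumption: treat each pair as an interval with start <= end.
    data = sorted((min(a, b), max(a, b)) for a, b in pairs)
    starts = [s for s, _ in data]
    n = len(data)
    f = [0] * (n + 1)                     # f[n] = 0 is the base case
    for i in range(n - 1, -1, -1):
        # First index whose start is strictly greater than this pair's end.
        nxt = bisect_right(starts, data[i][1])
        f[i] = max(1 + f[nxt], f[i + 1])
    return f[0]
```

This also shows why the approach is wrong for the original question: `max_non_overlapping([(1, 5), (2, 3)])` returns 1 because the two intervals overlap, yet the pairs share no number, so both could be chosen.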
Your question is equivalent to finding a maximum matching on a graph. The nodes of your graph are integers, and your pairs (a, b) are edges of the graph. A matching is a set of pairwise non-adjacent edges, which is equivalent to saying the same integer doesn't appear in two edges.
A polynomial-time solution to this problem is the Blossom algorithm, also known as Edmonds' algorithm. It's rather too complicated to include the details in the answer here.
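When testing a Blossom implementation, a brute-force reference for small inputs can help. The sketch below is not the Blossom algorithm: it is an exhaustive search that takes exponential time, and the name `max_disjoint_pairs` is an assumption for illustration.

```python
def max_disjoint_pairs(pairs):
    """Exact answer by exhaustive search; exponential, for small inputs only."""
    pairs = list(pairs)

    def best(i, used):
        # Best count using pairs[i:], given the numbers already used.
        if i == len(pairs):
            return 0
        a, b = pairs[i]
        result = best(i + 1, used)                  # skip pair i
        if a not in used and b not in used:         # or take pair i
            result = max(result, 1 + best(i + 1, used | {a, b}))
        return result

    return best(0, frozenset())
```

For example, the 5-cycle `[(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)]` gives 2. That graph is not bipartite, which is the kind of input that calls for a general matching algorithm.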
This problem can be recast in terms of graph theory. Nodes of the graph are the pairs you are given. Two nodes are connected if the pairs have a number in common. Your problem is to find a maximum independent set.
Maximum independent set is equivalent to finding a maximum clique in the complement graph, which is both NP-hard and "hard to approximate" in general. However, in this particular case, the graphs are of a special type called "claw-free" (http://en.wikipedia.org/wiki/Claw-free_graph): if a node is connected to three other nodes, at least two of those three must share a common number (each shares one of the node's two numbers), and so they are themselves connected.
It turns out that for the special case of claw-free graphs, the maximum independent set problem can be solved in polynomial time: http://en.wikipedia.org/wiki/Claw-free_graph#Independent_sets.
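The construction and the claw-free property can be checked on small inputs with a short Python sketch. The helper names `conflict_graph` and `has_claw` are hypothetical, and the claw check is a plain brute force rather than anything from the linked article.

```python
from itertools import combinations

def conflict_graph(pairs):
    """Node i stands for pairs[i]; i and j are adjacent if the pairs share a number."""
    adj = {i: set() for i in range(len(pairs))}
    for i, j in combinations(range(len(pairs)), 2):
        if set(pairs[i]) & set(pairs[j]):
            adj[i].add(j)
            adj[j].add(i)
    return adj

def has_claw(adj):
    """True if some node has three pairwise non-adjacent neighbours."""
    for nbrs in adj.values():
        for x, y, z in combinations(sorted(nbrs), 3):
            if y not in adj[x] and z not in adj[x] and z not in adj[y]:
                return True
    return False
```

For any list of pairs, `has_claw(conflict_graph(pairs))` should be `False`, in line with the argument above. A hand-built star such as `{0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}` returns `True`, so it cannot come from a list of pairs.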