Question
The abstract problem
I want to find the best (maximum-weight) maximum matching in a complete weighted bipartite graph where the two sets of vertices differ drastically in size, i.e. one set of vertices is very large and the other one is very small.
The Hungarian algorithm is not a good approach for this problem, since it adds dummy vertices to the smaller set so that the two sets have the same size. That way I lose all the potential efficiency gains from one of the vertex sets being very small.
More concretely
I have divided objects (bounding boxes) into two sets and I have a similarity measure (Jaccard overlap) for how similar any two objects are. I want to produce the matching between the two sets such that the sum of the similarities of all individual matches is maximal.
The problem is that one of the sets contains only very few objects, say 10, while the second set is very large, say 10,000 objects. Each of the 10 objects in the first set needs to be matched to one of the 10,000 objects in the second set.
This asymmetry in the sizes of the two sets is what makes me wonder how to do this efficiently. I can't use the Hungarian algorithm and produce a 10,000 by 10,000 matrix.
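To make the setup concrete, here is a minimal sketch of the Jaccard overlap and the resulting rectangular similarity matrix. The box format `(x1, y1, x2, y2)` and the names `jaccard` and `similarity_matrix` are assumptions for illustration, not part of the original question.

```python
def jaccard(a, b):
    """Jaccard overlap (intersection over union) of two axis-aligned boxes.

    Boxes are assumed to be (x1, y1, x2, y2) with x1 <= x2 and y1 <= y2.
    """
    # Corners of the intersection rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def similarity_matrix(small, large):
    """Rectangular matrix: one row per box in `small`, one column per box in `large`."""
    return [[jaccard(s, l) for l in large] for s in small]
```

With 10 boxes in the small set and 10,000 in the large set this is a 10 by 10,000 matrix, which is the input a rectangular assignment method would work on.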
Answer 1:
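Probably the easiest approach in terms of available software: use a min-cost network-flow solver. This formulation has no trouble with rectangular cost matrices! The basic idea is simple: connect a source to every vertex of the small set, every small-set vertex to every large-set vertex with the negated similarity as edge cost, and every large-set vertex to a sink, all with capacity 1. A min-cost flow whose value equals the size of the small set then corresponds to an optimal matching.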
There is a lot of available software (e.g. COIN-OR LEMON, written in C++; Google's ortools, written in C++ with wrappers for several other languages).
Google's ortools also has its own documentation entry on this.
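For illustration only, the flow formulation can be sketched in plain Python with successive shortest paths, so that nothing depends on a particular solver's API. The function name `max_weight_matching_flow` is hypothetical, and a real solver such as the ones named above will be far faster on a 10 by 10,000 instance.

```python
from collections import deque


def max_weight_matching_flow(weights):
    """Match every row (small set) to a distinct column (large set), n <= m.

    Network: source -> row -> column -> sink, all capacities 1, and the
    row -> column edge costs are the negated weights.
    Returns (total_weight, match) with match[i] = column chosen for row i.
    """
    n, m = len(weights), len(weights[0])
    assert n <= m, "rows must be the smaller set"
    source, sink, size = n + m, n + m + 1, n + m + 2
    graph = [[] for _ in range(size)]  # adjacency lists of edge indices
    to, cap, cost = [], [], []

    def add_edge(u, v, c, w):
        # Forward edge gets an even index e, its residual twin is e ^ 1.
        graph[u].append(len(to)); to.append(v); cap.append(c); cost.append(w)
        graph[v].append(len(to)); to.append(u); cap.append(0); cost.append(-w)

    for i in range(n):
        add_edge(source, i, 1, 0)
        for j in range(m):
            add_edge(i, n + j, 1, -weights[i][j])
    for j in range(m):
        add_edge(n + j, sink, 1, 0)

    eps = 1e-12  # tolerance so float rounding cannot cause endless relaxation
    for _ in range(n):  # one unit of flow (one augmentation) per row
        dist = [float("inf")] * size
        prev_edge = [-1] * size
        in_queue = [False] * size
        dist[source] = 0
        queue = deque([source])
        while queue:  # queue-based Bellman-Ford, handles negative costs
            u = queue.popleft()
            in_queue[u] = False
            for e in graph[u]:
                v = to[e]
                if cap[e] > 0 and dist[u] + cost[e] < dist[v] - eps:
                    dist[v] = dist[u] + cost[e]
                    prev_edge[v] = e
                    if not in_queue[v]:
                        in_queue[v] = True
                        queue.append(v)
        v = sink
        while v != source:  # push one unit along the cheapest path
            e = prev_edge[v]
            cap[e] -= 1
            cap[e ^ 1] += 1
            v = to[e ^ 1]

    match = [-1] * n
    for i in range(n):
        for e in graph[i]:
            if e % 2 == 0 and cap[e] == 0:  # saturated row -> column edge
                match[i] = to[e] - n
    return sum(weights[i][match[i]] for i in range(n)), match
```

Only as many augmentations as there are vertices in the small set are needed, and the network has no dummy vertices, which is where the saving over a padded square matrix comes from.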
Apart from that, the book:
Burkard, Rainer E., Mauro Dell'Amico, and Silvano Martello. Assignment Problems, revised reprint. Vol. 125. SIAM, 2009.
has a short section (5.4.4 Rectangular cost matrix) outlining other approaches, mostly modifications of other linear-assignment algorithms.
That section includes the following:
Alternatively, one can use the transformation to a minimum cost flow problem of Section 4.4.1, which does not require that vertex sets U and V have equal cardinality.
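As an illustration of the other family, modifying a linear-assignment algorithm so that it accepts a rectangular matrix directly, below is a sketch of the widely used shortest-augmenting-path form of the Hungarian method. It is not taken from the book, and the name `rectangular_assignment` is hypothetical. It iterates only over the rows of the small set, so it runs in O(n^2 * m) for n rows and m columns without any padding.

```python
def rectangular_assignment(weights):
    """Max-weight assignment of n rows to distinct columns, n <= m.

    Returns (total_weight, match) with match[i] = column chosen for row i.
    Internally minimises the negated weights; indices are 1-based with
    index 0 acting as a sentinel column.
    """
    n, m = len(weights), len(weights[0])
    assert n <= m, "rows must be the smaller set"
    INF = float("inf")
    u = [0.0] * (n + 1)   # row potentials
    v = [0.0] * (m + 1)   # column potentials
    p = [0] * (m + 1)     # p[j] = row currently matched to column j (0 = free)
    way = [0] * (m + 1)   # way[j] = previous column on the augmenting path

    for i in range(1, n + 1):  # insert rows one at a time
        p[0] = i
        j0 = 0
        minv = [INF] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], INF, 0
            for j in range(1, m + 1):
                if not used[j]:
                    # Reduced cost of edge (i0, j) under current potentials.
                    cur = -weights[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            for j in range(m + 1):  # shift potentials by delta
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:  # reached a free column: path is complete
                break
        while j0:  # flip the matching along the augmenting path
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1

    match = [-1] * n
    for j in range(1, m + 1):
        if p[j]:
            match[p[j] - 1] = j - 1
    return sum(weights[i][match[i]] for i in range(n)), match
```

For the sizes in the question (n = 10, m = 10,000) this is on the order of 10^6 basic operations, compared with roughly 10^12 for a cubic algorithm on a padded 10,000 by 10,000 matrix.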
Source: https://stackoverflow.com/questions/49697147/maximum-weighted-bipartite-matching-for-two-sets-of-vertices-of-drastically-diff