Question
How can I implement a bipartite matching algorithm (probably based on a max-flow algorithm) in C or C++?
To be specific, I have this input in a file: (1,3) (1,5) (2,5)
Each pair is (M,F), where M is the id of a male and F is the id of a female.
I need to find the maximum number of matches and show the matched couples, like: matches: 1&3, 2&5
I have read in some books that I can base this problem on a "maximum flow in a network" algorithm, but I couldn't find any specific info other than the sentence "this problem can be solved by .... algorithm". I have little knowledge about max-flow, and don't know how to implement it either...
Answer 1:
Yes, bipartite matching can be reduced to maximum flow:
1. You're given sets of nodes M and F. Add a directed edge from a node m in M to a node f in F if you've got the pair (m, f) in your file.
2. Add a single node S with a directed edge from S to every node in M (this is your "super-source" node).
3. Add a single node T with a directed edge from every node in F to T (this is your "super-sink" node).
4. Now, you need to find the maximum flow (with all your edges of weight 1) from S to T.
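As a rough illustration of the construction above, here is a minimal C++ sketch that parses the pairs and builds the unit-weight network as a capacity matrix. The names Network, parse_pairs, add_unit_edge and build_network are hypothetical, and the node layout (0 = S, then the males, then the females, last node = T) is an assumption of this sketch, not something the reduction requires.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical container for the network described in the steps above.
// Assumed node layout: 0 = S, then the males, then the females, last node = T.
struct Network {
    int source = 0;
    int sink = 0;
    std::vector<std::vector<int>> cap;  // cap[u][v] = weight of edge u -> v
    std::vector<int> male_id;           // original id of male node 1 + i
    std::vector<int> female_id;         // original id of female node 1 + males + j
};

// Reads "(m,f)" pairs from text such as "(1,3) (1,5) (2,5)".
// Stops at the first token that does not fit the pattern.
std::vector<std::pair<int, int>> parse_pairs(const std::string& text) {
    std::vector<std::pair<int, int>> pairs;
    std::istringstream in(text);
    char open = 0, comma = 0, close = 0;
    int m = 0, f = 0;
    while (in >> open >> m >> comma >> f >> close) {
        if (open != '(' || comma != ',' || close != ')') break;
        pairs.emplace_back(m, f);
    }
    return pairs;
}

// Adds a directed edge u -> v of weight 1.
void add_unit_edge(Network& net, int u, int v) {
    const int n = static_cast<int>(net.cap.size());
    assert(u >= 0 && u < n && v >= 0 && v < n);
    net.cap[u][v] = 1;
}

Network build_network(const std::vector<std::pair<int, int>>& pairs) {
    // Give every distinct id a compact 0-based index, in increasing id order.
    std::map<int, int> male_index, female_index;
    for (const auto& p : pairs) {
        male_index[p.first] = 0;
        female_index[p.second] = 0;
    }
    Network net;
    for (auto& entry : male_index) {
        entry.second = static_cast<int>(net.male_id.size());
        net.male_id.push_back(entry.first);
    }
    for (auto& entry : female_index) {
        entry.second = static_cast<int>(net.female_id.size());
        net.female_id.push_back(entry.first);
    }
    const int males = static_cast<int>(net.male_id.size());
    const int females = static_cast<int>(net.female_id.size());
    const int total = males + females + 2;
    net.source = 0;
    net.sink = total - 1;
    net.cap.assign(total, std::vector<int>(total, 0));
    for (int i = 0; i < males; ++i)
        add_unit_edge(net, net.source, 1 + i);          // S -> every m
    for (int j = 0; j < females; ++j)
        add_unit_edge(net, 1 + males + j, net.sink);    // every f -> T
    for (const auto& p : pairs)                         // m -> f for each pair
        add_unit_edge(net, 1 + male_index[p.first], 1 + males + female_index[p.second]);
    return net;
}
```

For the input in the question, this gives a 6-node network: S is node 0, males 1 and 2 are nodes 1 and 2, females 3 and 5 are nodes 3 and 4, and T is node 5.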
So what the heck is maximum flow? A flow from S to T is a set of edges such that for each node (except S and T), the weight of its in-flux edges is the same as the weight of its out-flux edges. Imagine that your graph is a series of pipes, and you're pouring water in the system at S and letting it out at T. At every node in between, the amount of water going in has to be the same as the amount of water coming out.
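To make the "water in equals water out" rule concrete, here is a small C++ sketch that checks it for a candidate flow. The names Matrix, is_valid_flow and flow_value are hypothetical, and storing the graph as a square matrix is just an assumption that keeps the example short.

```cpp
#include <cassert>
#include <vector>

using Matrix = std::vector<std::vector<int>>;

// Returns true if `flow` stays within the edge weights in `cap` and, at every
// node other than s and t, the amount flowing in equals the amount flowing out.
bool is_valid_flow(const Matrix& cap, const Matrix& flow, int s, int t) {
    const int n = static_cast<int>(cap.size());
    assert(static_cast<int>(flow.size()) == n);
    for (int u = 0; u < n; ++u) {
        int in = 0, out = 0;
        for (int v = 0; v < n; ++v) {
            if (flow[u][v] < 0 || flow[u][v] > cap[u][v]) return false;
            out += flow[u][v];
            in += flow[v][u];
        }
        if (u != s && u != t && in != out) return false;
    }
    return true;
}

// Net amount leaving s. For a valid flow this is also what arrives at t.
int flow_value(const Matrix& flow, int s) {
    const int n = static_cast<int>(flow.size());
    int value = 0;
    for (int v = 0; v < n; ++v) value += flow[s][v] - flow[v][s];
    return value;
}
```

With unit weights, a valid flow can send at most one unit into each node of M and at most one unit out of each node of F, which is why the used M-to-F edges never share an endpoint.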
Try to convince yourself that a flow corresponds to a matching of your original sets. (Given a flow, how do you get a matching? Given a matching, how do you get a flow?)
Finally, to find the maximum flow in a graph, you can use the Ford-Fulkerson algorithm. The Wikipedia page on it gives a good description, with pseudocode.
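For illustration, here is a minimal C++ sketch of Ford-Fulkerson specialised to this problem, using depth-first search to find each augmenting path. It is a sketch under assumptions rather than a reference implementation: the names Matrix, push_one_unit, max_flow and matched_pairs are hypothetical, the graph is stored as a square matrix, and nodes are assumed to be numbered 0 = S, 1..nm for M, nm+1..nm+nf for F, and nm+nf+1 = T.

```cpp
#include <cassert>
#include <utility>
#include <vector>

using Matrix = std::vector<std::vector<int>>;

// Depth-first search for a path from u to t along edges with spare capacity.
// If one exists, pushes one unit of flow along it and returns true.
bool push_one_unit(Matrix& residual, std::vector<bool>& seen, int u, int t) {
    if (u == t) return true;
    seen[u] = true;
    const int n = static_cast<int>(residual.size());
    for (int v = 0; v < n; ++v) {
        if (!seen[v] && residual[u][v] > 0 && push_one_unit(residual, seen, v, t)) {
            residual[u][v] -= 1;  // one unit of forward capacity is used up
            residual[v][u] += 1;  // a later path may undo this unit
            return true;
        }
    }
    return false;
}

// Repeats the search until no augmenting path is left.
// `residual` starts as a copy of the capacities and is updated in place.
int max_flow(Matrix& residual, int s, int t) {
    assert(s != t);
    int total = 0;
    while (true) {
        std::vector<bool> seen(residual.size(), false);
        if (!push_one_unit(residual, seen, s, t)) break;
        ++total;
    }
    return total;
}

// An M-to-F edge carries flow exactly when its capacity has been used up.
// Returns the matched (m, f) pairs as node numbers in the assumed layout.
std::vector<std::pair<int, int>> matched_pairs(const Matrix& cap, const Matrix& residual,
                                               int nm, int nf) {
    std::vector<std::pair<int, int>> couples;
    for (int m = 1; m <= nm; ++m)
        for (int f = nm + 1; f <= nm + nf; ++f)
            if (cap[m][f] == 1 && residual[m][f] == 0) couples.emplace_back(m, f);
    return couples;
}
```

Pushing a single unit per path is enough here because every edge has weight 1. On the network for the question's input, max_flow returns 2, and matched_pairs reports nodes (1, 3) and (2, 4), that is, the couples 1&3 and 2&5.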
Answer 2:
Yes, if you already have code to solve the max-flow problem, you can use it to solve bipartite matching by transforming the graph as shown towards the end of this lecture, but that's probably not the right approach if you are starting from scratch. If you just want to implement some fairly simple code to solve the problem for examples that don't get too huge, you are better off using a simple augmenting path approach as outlined here. That gives you an O(|V||E|) approach that is pretty easy to code and adequate for all but very large graphs. If you want something with better worst-case performance, you could try the Hopcroft-Karp algorithm, which finds multiple augmenting paths at once and has an O(sqrt(|V|)|E|) run time bound, but the Wikipedia article on it notes that:
Several authors have performed experimental comparisons of bipartite matching algorithms. Their results in general tend to show that the Hopcroft–Karp method is not as good in practice as it is in theory: it is outperformed both by simpler breadth-first and depth-first strategies for finding augmenting paths, and by push-relabel techniques.
In any case, you should definitely understand and be able to implement a simple augmenting-path approach before trying to tackle either Hopcroft-Karp or one of the push-relabel techniques mentioned in the references of the Wikipedia article.
Edit: For some reason, the links above aren't showing up correctly. Here are the URLs in question: (http://oucsace.cs.ohiou.edu/~razvan/courses/cs404/lecture21.pdf), (http://www.maths.lse.ac.uk/Courses/MA314/matching.pdf), and (http://en.wikipedia.org/wiki/Hopcroft–Karp_algorithm).
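As an illustration of the simple augmenting-path approach recommended in this answer, here is a minimal C++ sketch that works directly on the bipartite graph, with no source or sink node. The names try_augment and maximum_matching are hypothetical, and the sketch assumes the male and female ids have already been remapped to 0-based indices. It is not taken from the linked notes.

```cpp
#include <cassert>
#include <vector>

// adj[m] lists the females (0-based) that male m may be paired with.
// match_f[f] is the male currently paired with female f, or -1 if she is free.
// Tries to find an augmenting path starting at the unmatched male m.
bool try_augment(int m, const std::vector<std::vector<int>>& adj,
                 std::vector<bool>& visited, std::vector<int>& match_f) {
    for (int f : adj[m]) {
        if (visited[f]) continue;
        visited[f] = true;
        // Take f if she is free, or if her current partner can move elsewhere.
        if (match_f[f] == -1 || try_augment(match_f[f], adj, visited, match_f)) {
            match_f[f] = m;
            return true;
        }
    }
    return false;
}

// Runs one search per male, so the total work is O(|V||E|).
std::vector<int> maximum_matching(const std::vector<std::vector<int>>& adj,
                                  int num_females) {
    for (const auto& neighbours : adj)
        for (int f : neighbours) assert(f >= 0 && f < num_females);
    std::vector<int> match_f(num_females, -1);
    const int num_males = static_cast<int>(adj.size());
    for (int m = 0; m < num_males; ++m) {
        std::vector<bool> visited(num_females, false);
        try_augment(m, adj, visited, match_f);
    }
    return match_f;
}
```

The number of matches is the number of entries of the result that are not -1. For the question's input, with males 1 and 2 mapped to 0 and 1 and females 3 and 5 mapped to 0 and 1, adj is {{0, 1}, {1}} and the result pairs female 0 with male 0 and female 1 with male 1.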
Answer 3:
The QuickGraph library includes a bipartite matching algorithm, which I just worked on and checked in a fix for. It wraps the Edmonds-Karp maximum flow algorithm.
The only documentation for the algorithm so far is the unit tests I added. If anyone would like to add a (hopefully faster) implementation which does not simply wrap a maxflow algorithm, please contact me.
Answer 4:
Here is an experimental study of flow algorithms for maximum bipartite matching:
@article{cherkassky98,
author = {Boris V. Cherkassky and Andrew V. Goldberg and Paul Martin and Joao C. Setubal and Jorge Stolfi},
title = {Augment or Push: A Computational Study of Bipartite Matching and Unit Capacity Flow Algorithms},
journal = {Journal of Experimental Algorithmics},
volume = 3,
number = 8,
year = 1998
}
The winner was a push-relabel algorithm, which I believe was the implementation from Andrew Goldberg's "BIM" package, which you can download here:
http://www.avglab.com/andrew/soft.html
Mind you, if it's important that you code up the solution yourself, you might want to settle for Ford-Fulkerson, as Jesse suggested. If you do that, I recommend you use breadth-first search, not depth-first search, to find the augmenting path (for reasons explained in the article above).
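If you go the Ford-Fulkerson route with breadth-first search, the result is usually called Edmonds-Karp. The following C++ sketch shows one way it might look. The names Matrix and max_flow_bfs are hypothetical, and the square-matrix representation is an assumption chosen for brevity, not what the BIM package does.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <queue>
#include <vector>

using Matrix = std::vector<std::vector<int>>;

// Ford-Fulkerson where each augmenting path is found by breadth-first search,
// so every path uses as few edges as possible.
// `residual` is taken by value: it starts as the capacities and is updated locally.
int max_flow_bfs(Matrix residual, int s, int t) {
    assert(s != t);
    const int n = static_cast<int>(residual.size());
    int total = 0;
    while (true) {
        // parent[v] = node from which v was first reached, or -1 if not reached.
        std::vector<int> parent(n, -1);
        parent[s] = s;
        std::queue<int> frontier;
        frontier.push(s);
        while (!frontier.empty() && parent[t] == -1) {
            const int u = frontier.front();
            frontier.pop();
            for (int v = 0; v < n; ++v) {
                if (parent[v] == -1 && residual[u][v] > 0) {
                    parent[v] = u;
                    frontier.push(v);
                }
            }
        }
        if (parent[t] == -1) break;  // no augmenting path is left

        // Smallest spare capacity along the path found.
        int bottleneck = INT_MAX;
        for (int v = t; v != s; v = parent[v])
            bottleneck = std::min(bottleneck, residual[parent[v]][v]);

        // Push that much flow and record the reverse capacity.
        for (int v = t; v != s; v = parent[v]) {
            residual[parent[v]][v] -= bottleneck;
            residual[v][parent[v]] += bottleneck;
        }
        total += bottleneck;
    }
    return total;
}
```

On a unit-weight matching network the bottleneck is always 1, so the return value is the number of matched couples.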
Answer 5:
#include<stdio.h>
#include<conio.h>
int main(void)
{
int m,n,x,y,i,j,i1,j1,maxvalue;
float s[10][10] = {0,0};
int s2[10][10] = {0,0};
float f[20][20] = {0,0};
float f1[20][20] = {0,0};
float f2[20][20] = {0,0};
printf("\nEnter Number of Jobs(rows) and Machines(columns) of Matrix:\n");
scanf("%d%d",&m,&n);
printf("\nEnter the Pij elements of matrix:\n");
for(x=1;x<m+1;++x)
for(y=1;y<n+1;++y)
scanf("%f", &s[x][y]);
//Find sum of each row (stored in the extra column n+1)
for(x=1;x<m+1;++x)
{
s[x][n+1]=0;
for(y=1;y<n+1;++y)
s[x][n+1]=s[x][n+1]+s[x][y];
}
//Find sum of each column (stored in the extra row m+1)
for(y=1;y<n+1;++y)
{
s[m+1][y]=0;
for(x=1;x<m+1;++x)
s[m+1][y]+=s[x][y];
}
printf("\nMatrix s, Row Sum (Last Column) and Column Sum (Last Row) : \n");
printf("\ns:\n");
for(x=1;x<m+2;++x)
{
for(y=1;y<n+2;++y)
printf(" %2.0f " , s[x][y]);
printf("\n");
}
//Print sum of each column
/*x=n+1;
for(y=1;y<m+1;++y)
printf(" %2.0f " , s[x][y]);*/
printf("\n");
//Largest entry of s, including the row and column sums
maxvalue = s[1][1];
for(x=1; x<m+2; ++x)
for(y=1; y<n+2; ++y)
{
if(maxvalue < s[x][y])
maxvalue = s[x][y];
}
printf("\n");
printf("maxvalue = %d" , maxvalue);
printf("\nd1:\n");
float d1[20][20] = {0,0};
for(i=1;i<=m;++i)
{
for(j=1;j<=m;++j)
{
if(i==j)
d1[i][j] = maxvalue - s[i][n+1];
printf(" %2.0f " , d1[i][j]);
}
printf("\n");
}
printf("\nd2\n");
float d2[20][20] = {0,0};
for(i=1;i<=n;++i)
{
for(j=1;j<=n;++j)
{
if(i==j)
d2[i][j] = maxvalue - s[m+1][j];
printf(" %2.0f " , d2[i][j]);
}
printf("\n");
}
//row diff:
printf("\n\nRow diff:\n");
float r[20]= {0};
for(i=1;i<=n;i++)
for(j=1;j<=n;j++)
{
if(i == j)
{
r[i] = maxvalue - d2[i][j];
printf("%f ",r[i]);
}
}
//col diff:
printf("\n\nCol diff:\n");
float c[20]= {0};
for(i=1;i<=m;i++)
for(j=1;j<=m;j++)
{
if(i == j)
{
c[i] = maxvalue - d1[i][j];
printf("%f ",c[i]);
}
}
//assignment matrix:
float am[20][20]={0};
i=j=1;
ITERATION1:
if((c[i]<r[j]) && i<=m && j<=n)
{
am[j][i]=c[i];
r[j]=r[j]-c[i];
c[i]=0;
i++;
}
else if((c[i]>r[j]) && i<=m && j<=n)
{
am[j][i]=r[j];
c[i]=c[i]-r[j];
r[j]=0;
j++;
}
else if((c[i]==r[j]) && i<=m && j<=n)
{
am[j][i]=r[j];
c[i]=r[j]=0;
i++;j++;
}
else
goto END;
for(int z=0;z<=n;z++)
{
if(c[z]==0)
continue;
else
goto ITERATION1;
}
for(int b=0;b<=m;b++)
{
if(r[b]==0)
continue;
else
goto ITERATION1;
}
END:
printf("\n\nASSIGNMENT MATRIX:\n");
for(i=1;i<=n;i++)
{
for(j=1;j<=m;j++)
{
printf(" %2.0f ",am[i][j]);
}
printf("\n");
}
printf("\n\nf:\n");
for(i=1; i<(m+n)+1;++i)
{
for(j=1;j<(m+n)+1;++j)
{
if((i<=m) && (j<=n))
{
f[i][j]=s[i][j];
}
if((i<=m)&&(j>n))
{
f[i][j] = d1[i][j-n];
}
if((i>m)&&(j<=n))
{
f[i][j] = d2[i-m][j];
}
if((i>m)&&(j>n))
{
f[i][j] = am[i-m][j-n];
}
printf(" %2.0f " , f[i][j]);
}
printf("\n");
}
//printf("\n\nf1:\n");
for(i=1; i<(m+n)+1;++i)
{
for(j=1;j<(m+n)+1;++j)
{
f1[i][j]=f[i][j];
//printf(" %2.0f " , f1[i][j]);
}
//printf("\n");
}
int cnt = 0;
ITERATION2:
for(i=1; i<(m+n)+1;++i)
{
for(j=1;j<(m+n)+1;++j)
{
f2[i][j] = -1;
}
}
for(i=1; i<(m+n)+1;++i)
{
for(j=1;j<(m+n)+1;++j)
{
if(f1[i][j]!=0 && f2[i][j]!=0)
{
f2[i][j] = f1[i][j];
for(j1=j+1;j1<(m+n)+1;++j1)
f2[i][j1] = 0;
for(i1=i+1;i1<(m+n)+1;++i1)
f2[i1][j] = 0;
}
}
}
//printf("\n\nf2:\n");
for(i=1; i<(m+n)+1;++i)
{
for(j=1;j<(m+n)+1;++j)
{
if(f2[i][j] == -1)
{
f2[i][j] = 0;
}
//printf(" %2.0f " , f2[i][j]);
}
//printf("\n");
}
//printf("\n\nf1:\n");
for(i=1; i<(m+n)+1;++i)
{
for(j=1;j<(m+n)+1;++j)
{
if(f2[i][j] != 0)
{
f1[i][j] = f1[i][j] - 1;
}
//printf(" %2.0f " , f1[i][j]);
}
//printf("\n");
}
cnt++;
printf("\nITERATION - %d", cnt);
printf("\n\nGantt Chart:\n");
for(i=1; i<=m;++i)
{
for(j=1;j<=n;++j)
{
if(f2[i][j] != 0)
{
s2[i][cnt] = j;
printf(" J%d -> M%d", i,j);
}
}
printf("\n");
}
int sum = 1;
for(i=1; i<(m+n)+1;++i)
{
for(j=1;j<(m+n)+1;++j)
{
sum = sum + f1[i][j];
}
}
if(sum>1)
goto ITERATION2;
else
goto END2;
END2:
printf("\n\nFinal Gantt Chart:\n");
for(i=1; i<=m;++i)
{
for(j=0;j<=cnt;++j)
{
if(j == 0 )
printf(" J%d -> ", i);
else
{
if(s2[i][j] !=0)
printf(" M%d ", s2[i][j]);
else
printf(" %2c ", ' ');
}
}
printf("\n");
}
getch();
}
Source: https://stackoverflow.com/questions/878668/bipartite-matching