I have a problem finding an algorithm for sorting a dataset of people. I'll try to explain it in as much detail as possible:
The story starts with a survey. A bunch of people, l
You can solve this optimally by formulating it as a min cost network flow problem.
Add a node for each person, and one for each project.
Set the cost of a flow between a person and a project according to that person's preferences.
(NetworkX provides a min cost flow solver but not a max cost flow solver, so I have made the costs negative.)
Connect each project to a single destination node with an edge whose capacity is the project's size limit.
For example, using NetworkX and Python:
    import networkx as nx

    G = nx.DiGraph()
    prefs = {'Tom': ['Project1', 'Project2', 'Project3'],
             'Dick': ['Project2', 'Project1', 'Project3'],
             'Harry': ['Project1', 'Project3', 'Project2']}
    capacities = {'Project1': 2, 'Project2': 10, 'Project3': 4}
    num_persons = len(prefs)
    G.add_node('dest', demand=num_persons)  # sink absorbs one unit of flow per person
    for person, projectlist in prefs.items():
        G.add_node(person, demand=-1)  # each person supplies one unit of flow
        for i, project in enumerate(projectlist):
            if i == 0:
                cost = -100  # happy to assign first choices
            elif i == 1:
                cost = -60  # slightly unhappy to assign second choices
            else:
                cost = -30  # very unhappy to assign third choices
            G.add_edge(person, project, capacity=1, weight=cost)  # edge taken if person does this project
    for project, c in capacities.items():
        G.add_edge(project, 'dest', capacity=c, weight=0)  # at most c people per project
    flowdict = nx.min_cost_flow(G)
    for person in prefs:
        for project, flow in flowdict[person].items():
            if flow:
                print(person, 'joins', project)
In this code Tom's number 1 choice is Project1, followed by Project2, then Project3.
The capacities dictionary specifies the upper limit on how many people can join each project.
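For a tiny instance like this one, a brute-force enumeration can serve as a sanity check on what "optimal" means here. The sketch below uses only the standard library and is not part of the flow approach; the function name `brute_force` and the `rank_cost` list are hypothetical, and Harry's third choice is assumed to be Project2. It tries every combination of choices, discards those that exceed a capacity, and keeps the cheapest one under the same costs as above.

```python
from itertools import product

# Same data as the example above; Harry's third choice is assumed to be Project2.
prefs = {
    'Tom': ['Project1', 'Project2', 'Project3'],
    'Dick': ['Project2', 'Project1', 'Project3'],
    'Harry': ['Project1', 'Project3', 'Project2'],
}
capacities = {'Project1': 2, 'Project2': 10, 'Project3': 4}
rank_cost = [-100, -60, -30]  # cost of a 1st, 2nd and 3rd choice


def brute_force(prefs, capacities, rank_cost):
    """Try every assignment and return (best_cost, best_assignment)."""
    people = list(prefs)
    best_cost, best = None, None
    for choice in product(*(prefs[p] for p in people)):
        # Skip assignments that put too many people in one project.
        if any(choice.count(proj) > cap for proj, cap in capacities.items()):
            continue
        cost = sum(rank_cost[prefs[p].index(proj)]
                   for p, proj in zip(people, choice))
        if best_cost is None or cost < best_cost:
            best_cost, best = cost, dict(zip(people, choice))
    return best_cost, best


print(brute_force(prefs, capacities, rank_cost))
# (-300, {'Tom': 'Project1', 'Dick': 'Project2', 'Harry': 'Project1'})
```

The number of combinations grows exponentially with the number of people, so this is only useful for checking small cases; the flow formulation is the one that scales.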
My algorithm would be something like this:
    mainloop
        wishlevel = 1
        loop
            Distribute people into all projects according to wishlevel wish
            loop through projects, counting population
                If population exceeds maximum
                    Distribute excess non-redistributed people into their wishlevel+1 projects that are under-populated
                    tag distributed people as 'redistributed' to avoid moving again
                endif
            endloop
            wishlevel = wishlevel + 1
        loop until wishlevel == 3
    mainloop until no project exceeds max population
This should make several passes through the data set until everything is evened out. Be aware that the 'redistributed' restriction can cause an endless loop: if a project fills up with people who have already been redistributed, nobody can be moved out of it and the outer loop never finishes. You might try the algorithm without that restriction.
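Here is a minimal Python sketch of one possible reading of that pseudocode. The names `greedy_assign` and `population` are hypothetical, and two simplifications are assumed: everyone starts in their first choice, and the outer loop is replaced by one pass per wish level, so it cannot run forever but may leave some projects over capacity. People are only moved into projects that still have room, so nobody needs to be tagged as 'redistributed'.

```python
def greedy_assign(prefs, capacities):
    """Greedy redistribution sketch; returns (assignment, overfull_projects).

    Assumes every person lists at least one project.
    """
    # Start by giving everyone their first choice.
    assignment = {person: wishes[0] for person, wishes in prefs.items()}

    def population(project):
        return sum(1 for chosen in assignment.values() if chosen == project)

    max_wishes = max((len(wishes) for wishes in prefs.values()), default=0)
    for wish in range(1, max_wishes):  # index 1 = second choice, and so on
        for project, cap in capacities.items():
            members = [p for p, chosen in assignment.items() if chosen == project]
            for person in members:
                if population(project) <= cap:
                    break  # this project is no longer over capacity
                wishes = prefs[person]
                if wish >= len(wishes):
                    continue  # this person has no wish at this level
                target = wishes[wish]
                # Only move into a project that still has room.
                if population(target) < capacities.get(target, 0):
                    assignment[person] = target

    overfull = [proj for proj, cap in capacities.items() if population(proj) > cap]
    return assignment, overfull


prefs = {
    'Tom': ['Project1', 'Project2', 'Project3'],
    'Dick': ['Project2', 'Project1', 'Project3'],
    'Harry': ['Project1', 'Project3', 'Project2'],
}
capacities = {'Project1': 1, 'Project2': 10, 'Project3': 4}
assignment, overfull = greedy_assign(prefs, capacities)
print(assignment)  # {'Tom': 'Project2', 'Dick': 'Project2', 'Harry': 'Project1'}
print(overfull)    # []
```

Who gets moved depends on iteration order (here Tom is moved because he comes first), and the result is not guaranteed to be the best possible assignment, so treat a non-empty `overfull` list as a sign that a different strategy is needed.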