Sorting People into Groups based on Votes

后端 未结 2 415
我寻月下人不归
我寻月下人不归 2021-01-21 09:35

I have a problem with finding a algorithm for sorting a dataset of people. I try to explain as detailed as possible:

The story starts with a survey. A bunch of people, l

相关标签:
2条回答
  • 2021-01-21 09:54

    You can solve this optimally by formulating it as a min cost network flow problem.

    Add a node for each person, and one for each project.

    Set cost for a flow between a person and a project according to their preferences.

    (As Networkx provides a min cost flow, but not max cost flow I have set the costs to be negative.)

    For example, using Networkx and Python:

    import networkx as nx
    
    G=nx.DiGraph()
    
    prefs={'Tom':['Project1','Project2','Project3'],
           'Dick':['Project2','Project1','Project3'],
           'Harry':['Project1','Project3','Project1']}
    
    capacities={'Project1':2,'Project2':10,'Project3':4}
    
    num_persons=len(prefs)
    G.add_node('dest',demand=num_persons)
    A=[]
    for person,projectlist in prefs.items():
        G.add_node(person,demand=-1)
        for i,project in enumerate(projectlist):
            if i==0:
                cost=-100 # happy to assign first choices
            elif i==1:
                cost=-60 # slightly unhappy to assign second choices
            else:
                cost=-30 # very unhappy to assign third choices
            G.add_edge(person,project,capacity=1,weight=cost) # Edge taken if person does this project
    
    for project,c in capacities.items():
            G.add_edge(project,'dest',capacity=c,weight=0)
    
    flowdict = nx.min_cost_flow(G)
    for person in prefs:
        for project,flow in flowdict[person].items():
            if flow:
                print person,'joins',project
    

    In this code Tom's number 1 choice is Project1, followed by Project2, then Project3.

    The capacities dictionary specifies the upper limit on how many people can join each project.

    0 讨论(0)
  • 2021-01-21 10:08

    My algorithm would be something like this:

    mainloop
     wishlevel = 1
      loop
       Distribute people into all projects according to wishlevel wish
       loop through projects, counting population
        If population exceeds maximum
         Distribute excess non-redistributed people into their wishlevel+1 projects that are under-populated
         tag distributed people as 'redistributed' to avoid moving again
        endif
       endloop
      wishlevel = wishlevel + 1
     loop until wishlevel == 3    
    mainloop until no project exceeds max population
    

    This should make several passes through the data set until everything is evened out. This algorithm may result in an endless loop if you restrict redistribution of already-redistributed people in the event that one project fills up with such people as the algorithm progresses, so you might try it without that restriction.

    0 讨论(0)
提交回复
热议问题