Re-assign column values in a pandas df

傲寒 2021-02-02 14:02

This question is related to rostering or staffing. I'm trying to assign various jobs to individuals (employees). Using the df below,

`[Person]` =          


        
3 Answers
  •  花落未央
    2021-02-02 14:18

    In writing my other answer, I slowly came around to the idea that the OP's algorithm might be easier to implement with an approach that focuses on the jobs (which can be different), instead of the people (which are all the same). Here's a solution that uses the job-centric approach:

    import numpy as np
    import pandas as pd
    
    def assignJob(job, assignedix, areasPerPerson):
        # give the whole job to the first person with enough free slots
        for i in range(len(assignedix)):
            if (areasPerPerson - len(assignedix[i])) >= len(job):
                assignedix[i].extend(job)
                return True
        # nobody had room for the whole job
        return False
    
    def allocatePeople(df, areasPerPerson=3):
        areas = df['Area'].values
        times = pd.to_datetime(df['Time']).values
        peopleUniq = df['Person'].unique()
        npeople = int(np.ceil(areas.size / float(areasPerPerson)))
    
        # search for repeated areas. Mark them if the next repeat occurs within an hour
        ixrep = np.argmax(np.triu(areas.reshape(-1, 1)==areas, k=1), axis=1)
        holds = np.zeros(areas.size, dtype=bool)
        holds[ixrep.nonzero()] = (times[ixrep[ixrep.nonzero()]] - times[ixrep.nonzero()]) < np.timedelta64(1, 'h')
    
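        # build the jobs: each job is a list of row indices; rows of one area
        # that repeat within an hour are collected into the same job
        jobs = []
        _jobdict = {}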
        for i,(area,hold) in enumerate(zip(areas, holds)):
            if hold:
                _jobdict[area] = job = _jobdict.get(area, []) + [i]
                if len(job)==areasPerPerson:
                    jobs.append(_jobdict.pop(area))
            elif area in _jobdict:
                jobs.append(_jobdict.pop(area) + [i])
            else:
                jobs.append([i])
        jobs.sort()
    
        assignedix = [[] for i in range(npeople)]
        for job in jobs:
            if not assignJob(job, assignedix, areasPerPerson):
                # break the job up and try again
                for subjob in ([sj] for sj in job):
                    assignJob(subjob, assignedix, areasPerPerson)
    
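        # write the assignments back; aix holds row positions, so this
        # assumes df has the default 0..n-1 index
        df = df.copy()
        for i,aix in enumerate(assignedix):
            df.loc[aix, 'Person'] = peopleUniq[i]
        return df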
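
The repeat-detection step is the densest part of the function above, so here is a small standalone sketch of what it computes. The sample `areas` and `times` values are made up for illustration and do not come from the question.

```python
import numpy as np

# Hypothetical sample data (not from the question): six rows, areas A/B/C.
areas = np.array(['A', 'B', 'A', 'C', 'B', 'A'])
start = np.datetime64('2021-02-02T08:00')
times = start + np.array([0, 10, 30, 40, 120, 200], dtype='timedelta64[m]')

# eq[i, j] is True when rows i and j share an area. triu(k=1) keeps only j > i,
# so argmax returns the index of the next repeat, or 0 when there is none.
eq = areas.reshape(-1, 1) == areas
ixrep = np.argmax(np.triu(eq, k=1), axis=1)

# Row i is "held" when its next repeat falls less than one hour later.
has_repeat = ixrep.nonzero()
holds = np.zeros(areas.size, dtype=bool)
holds[has_repeat] = (times[ixrep[has_repeat]] - times[has_repeat]) < np.timedelta64(1, 'h')

print(ixrep.tolist())  # [2, 4, 5, 0, 0, 0]
print(holds.tolist())  # [True, False, False, False, False, False]
```

Only row 0 is held here: its next `A` comes 30 minutes later, while the other repeats are more than an hour apart. Because the comparison matrix is n-by-n, this step needs memory quadratic in the number of rows.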
    

    This version of allocatePeople has also been extensively tested and passes all of the same checks described in my other answer.
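
The checks from the other answer are not reproduced on this page, so the following is only a guess at the simplest one: that nobody ends up with more rows than `areasPerPerson`. The helper name `max_rows_per_person` and the sample frame are hypothetical.

```python
import pandas as pd

def max_rows_per_person(df, person_col='Person'):
    """Return the largest number of rows assigned to any single person."""
    return int(df[person_col].value_counts().max())

# Hypothetical allocation result, made up for illustration.
result = pd.DataFrame({
    'Area': ['A', 'A', 'B', 'C', 'D'],
    'Person': ['Person 1', 'Person 1', 'Person 1', 'Person 2', 'Person 2'],
})

areas_per_person = 3
# The capacity rule holds when no person exceeds areas_per_person rows.
assert max_rows_per_person(result) <= areas_per_person
```

In practice the same assertion could be run on the frame returned by `allocatePeople`.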

    It does have more looping than my other solution, so it is likely to be slightly less efficient (though it'll only matter if your dataframe is very large, say 1e6 rows and up). On the other hand, it is somewhat shorter and, I think, more straightforward and easy to understand.
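
The job-centric idea can be read as a first-fit pass with a split fallback. Here is a minimal sketch of just that packing step on plain lists, without the DataFrame; the names `first_fit` and `place` and the sample jobs are hypothetical.

```python
def first_fit(jobs, n_people, capacity):
    """Give each job to the first person with enough free slots.

    A job that fits nowhere whole is split into single rows, each placed
    on its own. Returns one list of row indices per person.
    """
    assigned = [[] for _ in range(n_people)]

    def place(job):
        for rows in assigned:
            if capacity - len(rows) >= len(job):
                rows.extend(job)
                return True
        return False

    for job in jobs:
        if not place(job):
            # no single person can take the whole job, so place row by row
            for row in job:
                place([row])
    return assigned

# Hypothetical jobs: lists of row indices that should ideally stay together.
jobs = [[0, 1], [2, 3], [4, 5], [6, 7]]
print(first_fit(jobs, n_people=3, capacity=3))  # [[0, 1, 6], [2, 3, 7], [4, 5]]
```

The last job `[6, 7]` does not fit with any one person, so it is split and its rows fill the remaining free slots of the first two people.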
