Run through large generator iterable on GPU

Question


I recently received help optimizing my code to use generators so it saves memory while checking many permutations. To put it in perspective, I believe the generator iterates over a sequence with 2! * 2! * 4! * 2! * 2! * 8! * 4! * 10! elements. Unfortunately, while I no longer run out of memory generating the permutations, my code now takes more than 24 hours to run. Is it possible to parallelize this on a GPU?
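For scale, the size of that search space can be computed directly from the group sizes behind those factorials (a quick sketch, nothing more):

import math

# Sizes of the groups whose internal orderings are permuted independently.
group_sizes = [2, 2, 4, 2, 2, 8, 4, 10]

# Total orderings = product of the factorials of the group sizes (math.prod needs Python 3.8+).
total = math.prod(math.factorial(n) for n in group_sizes)
print(total)  # 1348422598656000, roughly 1.35e15 permutations

At that scale, even millions of evaluations per second would take decades, not hours.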

Generating the iterator with all the above permutations only takes about a second; it is iterating through it that slows things down.

What the code is trying to do is find the permutation that minimizes a specific function (a variation of the stable marriage problem).

Each permutation is a list of names in a specific order, and there is a separate master list of jobs. Each name in the list has ranked those jobs in order of preference. The algorithm iterates through each name in the list and gives that person their top-ranked job if it has not already been taken by anyone before them. The goal is to minimize the average pick ranking across the candidates in the list (ideally, everyone gets their first choice of job). My code currently runs in O(n^2) time. I'm not sure if it can be optimized further, but ideally I'd like the code to finish in under 24 hours, and parallelizing it on a GPU might be the way to go.
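As a toy illustration of the scoring (made-up names and jobs, not my real data), one ordering of three candidates can score better than another:

# Hypothetical data: three candidates, each ranking three jobs.
toy_rankings = {
    "alice": ["j1", "j2", "j3"],
    "bob":   ["j1", "j3", "j2"],
    "carol": ["j1", "j2", "j3"],
}

def toy_score(ordering, rankings):
    available = {"j1", "j2", "j3"}
    picks = []
    for name in ordering:
        # Walk the candidate's preferences until an available job is found.
        for rank, job in enumerate(rankings[name], start=1):
            if job in available:
                available.remove(job)
                picks.append(rank)
                break
    return sum(picks) / len(picks)

print(toy_score(["alice", "bob", "carol"], toy_rankings))  # ~1.67: alice gets her 1st choice, bob and carol their 2nd
print(toy_score(["bob", "alice", "carol"], toy_rankings))  # 2.0: carol is pushed down to her 3rd choice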

This is the code that generates the permutations; it takes about 2 seconds. After generating them, it runs the min function to find the best result (the permutation with the lowest average candidate ranking).

import itertools
import operator

# data must already be sorted by its second field, since itertools.groupby only groups consecutive items.
groups = itertools.groupby(data, operator.itemgetter(1))
permutations = map(itertools.permutations, map(operator.itemgetter(1), groups))
results = map(list, map(itertools.chain.from_iterable, itertools.product(*permutations)))
best = min(results, key=gen_ranking_score)

import statistics

def gen_ranking_score(choice_order):
    # There are 40 roles to choose from.
    roles_temp = list(range(1, 41))
    candidate_assignment = {}
    for candidate in choice_order:
        candidate = candidate[0]
        # candidate_rankings is a dict initialized at the beginning that holds each
        # candidate's job preferences in order. Walk those rankings sequentially; assign
        # the first job that is still available and remove it from the available pool.
        for rank_index in range(1, len(candidate_rankings[candidate])):
            if candidate_rankings[candidate][rank_index] in roles_temp:
                candidate_assignment[candidate] = rank_index + 1
                roles_temp.remove(candidate_rankings[candidate][rank_index])
                break
    return statistics.mean(candidate_assignment.values())

I feel like I could take this code and split the scoring of the permutations across multiple cores, which seems like the kind of parallel workload a GPU is built for. Is this possible with Python?
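To make the idea concrete, here is a rough CPU-only sketch of what I mean by splitting the work (using multiprocessing rather than a GPU; the chunk size is an arbitrary guess, and results and gen_ranking_score are the objects defined above):

import itertools
import multiprocessing

def chunked(iterable, size):
    # Yield successive lists of `size` items from a (possibly huge) iterator.
    it = iter(iterable)
    while True:
        chunk = list(itertools.islice(it, size))
        if not chunk:
            return
        yield chunk

def best_of_chunk(chunk):
    # Score one chunk of permutations and return the local winner.
    return min(chunk, key=gen_ranking_score)

if __name__ == "__main__":
    with multiprocessing.Pool() as pool:
        # results is the lazy generator of permutations built above.
        local_bests = pool.imap_unordered(best_of_chunk, chunked(results, 100_000))
        best = min(local_bests, key=gen_ranking_score)

Even spread across many workers, though, roughly 1.35e15 permutations is a lot of scoring, so I'm not sure this alone gets me under 24 hours.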

Source: https://stackoverflow.com/questions/48890902/run-through-large-generator-iterable-on-gpu
