Question
I have been trying to speed up my code (with numba and multiprocessing), but I cannot quite get it to work, because my function takes a lot of arguments.
I have already simplified it by moving parts into other functions (see below).
As each agent (a class instance) is independent of the others for these actions, I would like to replace the for loop with Pool.
So I would end up with one large function, pooling(), that I would call with the list of agents:
from multiprocessing import Pool
p = Pool(4)
p.map(pooling, list(agents))
But where do I add all the other arguments that the pooling function will need?
This is the function as it currently stands:
def check_demographics(month, my_agents, families, firms, year, mortality_men, mortality_women, fertility, state_id):
    dummy = list(my_agents)
    d = str(state_id.iloc[0])
    # Place where I would like to replace the LOOP. All below would be a function
    for agent in dummy:
        if agent.get_region_id()[:2] == d:
            # Birthday
            if month % 12 == agent.month - 1:
                agent.update_age()
            # Mortality probability
            if agent.get_gender() == 'Male':
                prob = mortality_men[mortality_men['age'] == agent.get_age()][year].iloc[0]
            # When gender is Female
            else:
                # Extract specific agent data to calculate mortality 'Female'
                prob = mortality_women[mortality_women['age'] == agent.get_age()][year].iloc[0]
                # Give birth decision
                age = agent.get_age()
                if 14 < age < 50:
                    pregnant(agent, fertility, year, families, my_agents)
            # Mortality procedures
            if fixed_seed.random() < prob:
                mortal(my_agents, my_graveyard, families, agent, firms)
It is the most time-consuming function in my programme, and @jit is not helping much.
Thanks a bunch
Answer 1:
Yes, that is a lot of parameters! Consider using a class.
Since Pool.map supports only one iterable argument, you need to group everything in one place. I suggest using the "Facade" pattern: an intermediate class that stores all the required parameters and has a single method (I call it check) that takes no parameters besides self.
class Facade(object):
    def __init__(self, agent, d, families, fertility, firms, month, mortality_men, mortality_women, my_agents,
                 my_graveyard, year):
        self.agent = agent
        self.d = d
        self.families = families
        self.fertility = fertility
        self.firms = firms
        self.month = month
        self.mortality_men = mortality_men
        self.mortality_women = mortality_women
        self.my_agents = my_agents
        self.my_graveyard = my_graveyard
        self.year = year

    def check(self):
        (agent, d, families, fertility, firms,
         month, mortality_men, mortality_women,
         my_agents, my_graveyard, year) = (
            self.agent, self.d, self.families, self.fertility, self.firms,
            self.month, self.mortality_men, self.mortality_women,
            self.my_agents, self.my_graveyard, self.year)
        if agent.get_region_id()[:2] == d:
            # Birthday
            if month % 12 == agent.month - 1:
                agent.update_age()
            # Mortality probability
            if agent.get_gender() == 'Male':
                prob = mortality_men[mortality_men['age'] == agent.get_age()][year].iloc[0]
            # When gender is Female
            else:
                # Extract specific agent data to calculate mortality 'Female'
                prob = mortality_women[mortality_women['age'] == agent.get_age()][year].iloc[0]
                # Give birth decision
                age = agent.get_age()
                if 14 < age < 50:
                    pregnant(agent, fertility, year, families, my_agents)
            # Mortality procedures
            if fixed_seed.random() < prob:
                mortal(my_agents, my_graveyard, families, agent, firms)
Remark: my refactoring is really ugly, but I wanted to keep the variable names unchanged for clarity.
Then your loop can look something like this:
def check_demographics(month, my_agents, families, firms,
                       year, mortality_men, mortality_women,
                       fertility, state_id, my_graveyard):
    d = str(state_id.iloc[0])
    pool = Pool(4)
    facades = [Facade(agent, d, families, fertility, firms,
                      month, mortality_men, mortality_women,
                      my_agents, my_graveyard, year)
               for agent in my_agents]
    pool.map(Facade.check, facades)
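If a wrapper class feels heavy, another option is functools.partial, which binds the shared arguments up front so that Pool.map only has to supply the agent. The following is a minimal sketch of that idea, not the code from the question: Agent, check_agent, check_all and the flat mortality dict are hypothetical stand-ins for the real agent class and the pandas tables, and the numbers are made up.

```python
from functools import partial
from multiprocessing import Pool


class Agent:
    """Hypothetical stand-in for the question's agent class."""

    def __init__(self, region_id, age):
        self.region_id = region_id
        self.age = age


def check_agent(agent, d, mortality):
    """Return the mortality probability for one agent, or None if it is outside region d."""
    if agent.region_id[:2] != d:
        return None
    return mortality[agent.age]


def check_all(agents, d, mortality, processes=2):
    # Bind the arguments that are identical for every agent;
    # Pool.map then passes each agent as the remaining first argument.
    worker = partial(check_agent, d=d, mortality=mortality)
    with Pool(processes) as pool:
        return pool.map(worker, agents)


if __name__ == "__main__":
    agents = [Agent("5301", 30), Agent("3304", 65), Agent("5302", 65)]
    mortality = {30: 0.001, 65: 0.02}  # made-up values for illustration
    print(check_all(agents, "53", mortality))
```

Either way, the bound arguments are pickled and sent to the worker processes along with the tasks, so large shared objects (such as the full agent list) add overhead.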
You said that each agent is independent of the others but, after analysing the loop, I see that you need the complete list of agents (the my_agents parameter). This becomes obvious with the Facade class. So your agent list must not change, and the internal state of each agent must stay frozen during the loop.
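One consequence worth spelling out: Pool.map pickles each argument and sends it to a worker process, so a method such as update_age() changes the worker's copy, not the agent held by the parent process. Below is a minimal sketch of one way to handle this, by returning the new state and applying it in the parent. Agent, birthday and apply_updates are hypothetical names used for illustration, not the question's code.

```python
from multiprocessing import Pool


class Agent:
    """Hypothetical stand-in for the question's agent class."""

    def __init__(self, agent_id, age):
        self.agent_id = agent_id
        self.age = age

    def update_age(self):
        self.age += 1


def birthday(agent):
    # Runs in a worker process on an unpickled copy of the agent.
    agent.update_age()
    # Return the new state explicitly so the parent can apply it.
    return agent.agent_id, agent.age


def apply_updates(agents, updates):
    """Write the (agent_id, age) pairs computed by the workers back to the parent's agents."""
    by_id = {agent.agent_id: agent for agent in agents}
    for agent_id, age in updates:
        by_id[agent_id].age = age


if __name__ == "__main__":
    agents = [Agent(1, 30), Agent(2, 41)]
    with Pool(2) as pool:
        updates = pool.map(birthday, agents)
    # The parent's agents are untouched at this point; only the copies aged.
    print([a.age for a in agents])
    apply_updates(agents, updates)
    print([a.age for a in agents])
```

The same reasoning would apply to mortal and pregnant if, as their arguments suggest, they modify shared containers such as my_agents: those changes would have to be collected from the workers and applied in the parent after map returns.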
Source: https://stackoverflow.com/questions/38212265/implementing-pool-on-a-for-loop-with-a-lot-of-inputs