问题
Warning, this is gonna be long since I want to be as specific as I can be.
Exact problem: This is a multi-processing problem. I have ensured that my classes all behave as built/expected in previous experiments.
edit: said threading beforehand.
When I run toy example of my problem in a threaded environment, everything behaves; however, when I transition into my real problem, the code breaks. Specifically, I get a TypeError: can't pickle _thread.lock objects
error. Full stack is at the bottom.
My threading needs here are bit different than the example I adapted my code from -- https://github.com/CMA-ES/pycma/issues/31. In this example we have one fitness function that can be independently called by each evaluation and none of the function calls can interact with each other. However, in my real problem we are trying to optimize neural network weights using a genetic algorithm. The GA will suggest potential weights and we need to evaluate these NN controller-weights in our environment. In a single threaded case, we can have just one environment where we evaluate the weights with a simple for-loop: [nn.evaluate(weights) for weights in potential_candidates]
, find the best-performing individual, and use those weights in the next mutation round. However, we cannot simply have one simulation in a threaded environment.
So, instead of passing in a single function to evaluate I am passing in a list of function (one for each individual, where the environment is the same, but we have forked the processes so that the communication streams don't interact between individuals.)
One further thing of immediate note: I am using a build-for-parallel evaluation data-structure from neat
from neat.parallel import ParallelEvaluator # uses multiprocessing.Pool
Toy example code:
NPARAMS = nn.flat_init_weights.shape[0] # make this a 1000-dimensional problem.
NPOPULATION = 5 # use population size of 5.
MAX_ITERATION = 100 # run each solver for 100 function calls.
import time
from neat.parallel import ParallelEvaluator # uses multiprocessing.Pool
import cma
def fitness(x):
time.sleep(0.1)
return sum(x**2)
# # serial evaluation of all solutions
# def serial_evals(X, f=fitness, args=()):
# return [f(x, *args) for x in X]
# parallel evaluation of all solutions
def _evaluate2(self, weights, *args):
"""redefine evaluate without the dependencies on neat-internal data structures
"""
jobs = []
for i, w in enumerate(weights):
jobs.append(self.pool.apply_async(self.eval_function[i], (w, ) + args))
return [job.get() for job in jobs]
ParallelEvaluator.evaluate2 = _evaluate2
parallel_eval = ParallelEvaluator(12, [fitness]*NPOPULATION)
# time both
for eval_all in [parallel_eval.evaluate2]:
es = cma.CMAEvolutionStrategy(NPARAMS * [1], 1, {'maxiter': MAX_ITERATION,
'popsize': NPOPULATION})
es.disp_annotation()
while not es.stop():
X = es.ask()
es.tell(X, eval_all(X))
es.disp()
Necessary background:
When I switch from the toy example to my real code, the above fails.
My classes are:
LevelGenerator (simple GA class that implements mutate, etc)
GridGame (OpenAI wrapper; launches a Java server in which to run the simulation;
handles all communication between the Agent and the environment)
Agent (neural-network class, has an evaluate fn which uses the NN to play a single rollout)
Objective (handles serializing/de-serializing weights: numpy <--> torch; launching the evaluate function)
# The classes get composed to get the necessary behavior:
env = GridGame(Generator)
agent = NNAgent(env) # NNAgent is a subclass of (Random) Agent)
obj = PyTorchObjective(agent)
# My code normally all interacts like this in the single-threaded case:
def test_solver(solver): # Solver: CMA-ES, Differential Evolution, EvolutionStrategy, etc
history = []
for j in range(MAX_ITERATION):
solutions = solver.ask() #2d-numpy array. (POPSIZE x NPARAMS)
fitness_list = np.zeros(solver.popsize)
for i in range(solver.popsize):
fitness_list[i] = obj.function(solutions[i], len(solutions[i]))
solver.tell(fitness_list)
result = solver.result() # first element is the best solution, second element is the best fitness
history.append(result[1])
scores[j] = fitness_list
return history, result
So, when I attempt to run:
NPARAMS = nn.flat_init_weights.shape[0]
NPOPULATION = 5
MAX_ITERATION = 100
_x = NNAgent(GridGame(Generator))
gyms = [_x.mutate(0.0) for _ in range(NPOPULATION)]
objs = [PyTorchObjective(a) for a in gyms]
def evaluate(objective, weights):
return objective.fun(weights, len(weights))
import time
from neat.parallel import ParallelEvaluator # uses multiprocessing.Pool
import cma
def fitness(agent):
return agent.evalute()
# # serial evaluation of all solutions
# def serial_evals(X, f=fitness, args=()):
# return [f(x, *args) for x in X]
# parallel evaluation of all solutions
def _evaluate2(self, X, *args):
"""redefine evaluate without the dependencies on neat-internal data structures
"""
jobs = []
for i, x in enumerate(X):
jobs.append(self.pool.apply_async(self.eval_function[i], (x, ) + args))
return [job.get() for job in jobs]
ParallelEvaluator.evaluate2 = _evaluate2
parallel_eval = ParallelEvaluator(12, [obj.fun for obj in objs])
# obj.fun takes in the candidate weights, loads them into the NN, and then evaluates the NN in the environment.
# time both
for eval_all in [parallel_eval.evaluate2]:
es = cma.CMAEvolutionStrategy(NPARAMS * [1], 1, {'maxiter': MAX_ITERATION,
'popsize': NPOPULATION})
es.disp_annotation()
while not es.stop():
X = es.ask()
es.tell(X, eval_all(X, NPARAMS))
es.disp()
I get the following error:
TypeError Traceback (most recent call last)
<ipython-input-57-3e6b7bf6f83a> in <module>
6 while not es.stop():
7 X = es.ask()
----> 8 es.tell(X, eval_all(X, NPARAMS))
9 es.disp()
<ipython-input-55-2182743d6306> in _evaluate2(self, X, *args)
14 jobs.append(self.pool.apply_async(self.eval_function[i], (x, ) + args))
15
---> 16 return [job.get() for job in jobs]
<ipython-input-55-2182743d6306> in <listcomp>(.0)
14 jobs.append(self.pool.apply_async(self.eval_function[i], (x, ) + args))
15
---> 16 return [job.get() for job in jobs]
~/miniconda3/envs/thesis/lib/python3.7/multiprocessing/pool.py in get(self, timeout)
655 return self._value
656 else:
--> 657 raise self._value
658
659 def _set(self, i, obj):
~/miniconda3/envs/thesis/lib/python3.7/multiprocessing/pool.py in _handle_tasks(taskqueue, put, outqueue, pool, cache)
429 break
430 try:
--> 431 put(task)
432 except Exception as e:
433 job, idx = task[:2]
~/miniconda3/envs/thesis/lib/python3.7/multiprocessing/connection.py in send(self, obj)
204 self._check_closed()
205 self._check_writable()
--> 206 self._send_bytes(_ForkingPickler.dumps(obj))
207
208 def recv_bytes(self, maxlength=None):
~/miniconda3/envs/thesis/lib/python3.7/multiprocessing/reduction.py in dumps(cls, obj, protocol)
49 def dumps(cls, obj, protocol=None):
50 buf = io.BytesIO()
---> 51 cls(buf, protocol).dump(obj)
52 return buf.getbuffer()
53
TypeError: can't pickle _thread.lock objects
I also read here that this might be being caused by the fact that this is a class function -- TypeError: can't pickle _thread.lock objects -- so I created the global scoped fitness function def fitness(agent): return agent.evalute()
, but that didn't work either.
I thought this error might be coming from the fact that originally, I had the evaluate function in the PyTorchObjective class as a lambda function, but when I changed that it still broke.
Any insight would be greatly appreciated, and thanks for reading this giant wall of text.
回答1:
You are not using multiple threads. You are using multiple processes.
All arguments that you pass to apply_async
, including the function itself, are serialized (pickled) under the hood and passed to a worker process via an IPC channel (read up multiprocessing documentation for details). So you cannot pass any entities that are tied to things that are by their nature process-local. This includes most synchronization primitives since they have to use locks to do atomic operations.
Whenever this happens (as many other questions on this error message show), you are likely trying to be too smart and passing to a parallelization framework an object that already has parallelization logic built in.
If you want to create "multiple levels of parallelization" with such "parallelized object", you'll be better off either:
- using the parallelization mechanism of that object proper and not bother about multiple levels: you can't do more stuff at a time than you have cores anyway; or
- create and use these "parallelized objects" inside worker processes
- but you are likely to hit
multiprocessing
limitations here since its worker processes are deliberately prohibited from spawning their own pools.- You can let workers add extra items to the work queue but may hit Queue limitations as well.
- so for such a scenario, a more advanced 3rd-party distributed work queue solution may be preferrable.
- but you are likely to hit
来源:https://stackoverflow.com/questions/59441355/python-multiprocessing-pools-interaction-with-a-class-objective-function-and-ne