Question
I'm implementing SEED using Ray, and therefore I define a Worker class as follows:
import numpy as np
import gym

class Worker:
    def __init__(self, worker_id, env_name, n):
        import os
        os.environ['OPENBLAS_NUM_THREADS'] = '1'
        self._id = worker_id
        self._n_envs = n
        self._envs = [gym.make(env_name) for _ in range(self._n_envs)]

    def reset_env(self, env_id):
        return self._envs[env_id].reset()

    def env_step(self, env_id, action):
        return self._envs[env_id].step(action)
Besides that, there is a loop in the Learner that invokes methods of Worker when necessary to interact with the environment.

As this document suggests, I want to make sure each worker uses exactly one CPU resource. Here are some of my attempts:

- When creating a worker, I set num_cpus=1:

  worker = ray.remote(num_cpus=1)(Worker).remote(...)

- I checked my numpy configuration using np.__config__.show(), which gave me the following information:
blas_mkl_info:
  NOT AVAILABLE
blis_info:
  NOT AVAILABLE
openblas_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
blas_opt_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
lapack_mkl_info:
  NOT AVAILABLE
openblas_lapack_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
lapack_opt_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
I noticed that numpy is using OpenBLAS, so I set os.environ['OPENBLAS_NUM_THREADS'] = '1' in the Worker class, as the code above does, following this instruction.
After both were done, I opened top and still noticed that each Worker uses 130%-180% CPU, exactly the same as before. I've also tried setting os.environ['OPENBLAS_NUM_THREADS'] = '1' at the beginning of the main Python script, and using export OPENBLAS_NUM_THREADS=1, but nothing helps. What can I do now?
Answer 1:
You can pin each worker to a core. For example, you can use something like psutil.Process().cpu_affinity([i]) to pin core index i at each worker.
Also, before you pin the CPU, make sure you know which CPUs have been assigned to the worker, via this API: https://github.com/ray-project/ray/blob/203c077895ac422b80e31f062d33eadb89e66768/python/ray/worker.py#L457
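Before wiring this into Ray, the affinity call itself can be tried on its own. This is a minimal standalone sketch (note that psutil's cpu_affinity is only available on Linux and Windows, not macOS):

```python
import psutil

p = psutil.Process()
original = p.cpu_affinity()       # cores the process may currently run on
p.cpu_affinity([original[0]])     # pin the process to a single core
print(p.cpu_affinity())           # now reports only that one core
p.cpu_affinity(original)          # restore the previous mask
```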
Example:

import psutil
import ray

ray.init(num_cpus=4)

@ray.remote(num_cpus=1)
def f():
    import numpy
    resources = ray.get_resource_ids()
    cpus = [v[0] for v in resources['CPU']]
    psutil.Process().cpu_affinity(cpus)
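Separately, one detail that may explain why the OPENBLAS_NUM_THREADS attempts in the question had no effect: OpenBLAS sizes its thread pool when the library is first loaded, so the variable has to be set before numpy is imported anywhere in the process. In the Worker above, numpy is imported at module top, before __init__ runs, so setting the variable inside __init__ comes too late. A minimal sketch of the required ordering:

```python
import os
# These must be set BEFORE numpy (and hence OpenBLAS) is imported;
# setting them after the import has no effect on the thread pool.
os.environ['OPENBLAS_NUM_THREADS'] = '1'
os.environ['OMP_NUM_THREADS'] = '1'  # also covers OpenMP-based builds

import numpy as np  # OpenBLAS now initializes with a single thread

a = np.random.rand(256, 256)
b = a @ a  # this matmul runs single-threaded under the limits above
```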
Source: https://stackoverflow.com/questions/61051911/how-to-ensure-each-worker-use-exactly-one-cpu