Question
I started Ray from a terminal, inside an environment called p_c that has pandas installed, with the command ray start --head --num-cpus=2 --num-gpus=0
Then I ran the following Python script:
import ray
import os
import pandas as pd
import sys

ray.init(address='auto', redis_password='5241590000000000')

@ray.remote
def foo():
    import pandas as pd
    print("This runs on the VM")
    print(os.getcwd())
    print(sys.path)
    data = pd.read_csv('/Documents/sample.data')
    return 1

print("This runs locally")
print(ray.get(foo.remote()))
Running this raised the following error:
WARNING: Logging before InitGoogleLogging() is written to STDERR
I1014 13:56:23.410329 16563 16563 global_state_accessor.cc:25] Redis server address = 192.168.29.24:6379, is test flag = 0
I1014 13:56:23.411886 16563 16563 redis_client.cc:146] RedisClient connected.
I1014 13:56:23.421353 16563 16563 redis_gcs_client.cc:89] RedisGcsClient Connected.
I1014 13:56:23.423465 16563 16563 service_based_gcs_client.cc:193] Reconnected to GCS server: 192.168.29.24:37125
I1014 13:56:23.424247 16563 16563 service_based_accessor.cc:92] Reestablishing subscription for job info.
I1014 13:56:23.424291 16563 16563 service_based_accessor.cc:422] Reestablishing subscription for actor info.
I1014 13:56:23.424387 16563 16563 service_based_accessor.cc:797] Reestablishing subscription for node info.
I1014 13:56:23.424415 16563 16563 service_based_accessor.cc:1073] Reestablishing subscription for task info.
I1014 13:56:23.424441 16563 16563 service_based_accessor.cc:1248] Reestablishing subscription for object locations.
I1014 13:56:23.424466 16563 16563 service_based_accessor.cc:1368] Reestablishing subscription for worker failures.
I1014 13:56:23.424504 16563 16563 service_based_gcs_client.cc:86] ServiceBasedGcsClient Connected.
This runs locally
Traceback (most recent call last):
  File "hello1.py", line 26, in <module>
    print(ray.get(foo.remote()))
  File "/home/jatin/.local/lib/python3.8/site-packages/ray/worker.py", line 1538, in get
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(ModuleNotFoundError): ray::__main__.foo() (pid=16182, ip=192.168.29.24)
  File "python/ray/_raylet.pyx", line 479, in ray._raylet.execute_task
  File "hello1.py", line 17, in foo
    import pandas as pd
ModuleNotFoundError: No module named 'pandas'
I have pandas installed in every location I can think of. I cannot work out where the worker is looking for the pandas module, or why it fails to find it. Without the pandas import, the code runs fine.
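To narrow down where the worker is looking, something like the sketch below might help. It reports which Python executable is running and whether that interpreter can locate a given module, without actually importing it. The helper name describe_interpreter is hypothetical (it is not part of Ray or pandas), and the snippet uses only the standard library.

```python
import importlib.util
import sys


def describe_interpreter(module_name="pandas"):
    """Report which Python is running and whether it can locate module_name.

    Hypothetical diagnostic helper, not part of Ray or pandas.
    """
    # find_spec returns None when a top-level module cannot be located,
    # so a missing module is reported instead of raising.
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,      # interpreter running this code
        "sys_path": list(sys.path),        # directories searched on import
        "module_found": spec is not None,
        "module_location": spec.origin if spec is not None else None,
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

Calling it once in the driver script and once inside foo (returning the dictionary from the task rather than importing pandas there) would allow the two results to be compared. If the executable values differ, the worker was presumably started by a different Python than the one in p_c. The traceback shows ray being loaded from /home/jatin/.local/lib/python3.8/site-packages, which may point in that direction, though that is only a guess from the paths shown.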
Source: https://stackoverflow.com/questions/64349519/error-in-ray-modulenotfounderror-no-module-named-pandas