Error in Ray: “ModuleNotFoundError: No module named 'pandas'”

帅比萌擦擦* submitted on 2020-12-13 04:47:07

Question


I started Ray in a terminal, inside an environment called p_c that has pandas installed, using the command ray start --head --num-cpus=2 --num-gpus=0

Then I ran the following Python script:

import ray
import os
import pandas as pd
import sys

ray.init(address='auto', redis_password='5241590000000000')

@ray.remote
def foo():
    import pandas as pd
    print("This runs on the VM")
    print(os.getcwd())
    print(sys.path)
    data = pd.read_csv('/Documents/sample.data')
    
    return 1

print("This runs locally")
print(ray.get(foo.remote()))

Running this raised the following error:

    WARNING: Logging before InitGoogleLogging() is written to STDERR
    I1014 13:56:23.410329 16563 16563 global_state_accessor.cc:25] Redis server address = 192.168.29.24:6379, is test flag = 0
    I1014 13:56:23.411886 16563 16563 redis_client.cc:146] RedisClient connected.
    I1014 13:56:23.421353 16563 16563 redis_gcs_client.cc:89] RedisGcsClient Connected.
    I1014 13:56:23.423465 16563 16563 service_based_gcs_client.cc:193] Reconnected to GCS server: 192.168.29.24:37125
    I1014 13:56:23.424247 16563 16563 service_based_accessor.cc:92] Reestablishing subscription for job info.
    I1014 13:56:23.424291 16563 16563 service_based_accessor.cc:422] Reestablishing subscription for actor info.
    I1014 13:56:23.424387 16563 16563 service_based_accessor.cc:797] Reestablishing subscription for node info.
    I1014 13:56:23.424415 16563 16563 service_based_accessor.cc:1073] Reestablishing subscription for task info.
    I1014 13:56:23.424441 16563 16563 service_based_accessor.cc:1248] Reestablishing subscription for object locations.
    I1014 13:56:23.424466 16563 16563 service_based_accessor.cc:1368] Reestablishing subscription for worker failures.
    I1014 13:56:23.424504 16563 16563 service_based_gcs_client.cc:86] ServiceBasedGcsClient Connected.
    This runs locally
    Traceback (most recent call last):
      File "hello1.py", line 26, in <module>
        print(ray.get(foo.remote()))
      File "/home/jatin/.local/lib/python3.8/site-packages/ray/worker.py", line 1538, in get
        raise value.as_instanceof_cause()
    ray.exceptions.RayTaskError(ModuleNotFoundError): ray::__main__.foo() (pid=16182, ip=192.168.29.24)
      File "python/ray/_raylet.pyx", line 479, in ray._raylet.execute_task
      File "hello1.py", line 17, in foo
        import pandas as pd
    ModuleNotFoundError: No module named 'pandas'

I have pandas installed on all the possible paths. I cannot work out where exactly the worker is looking for the pandas module, or why it fails to find it. Without the pandas import, the code runs fine.
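One way to narrow this down might be to compare the interpreter and module search path seen by the driver with those seen by the worker. The sketch below uses only the standard library; describe_environment is a hypothetical helper name, not part of Ray, and whether the two processes really differ here is an assumption that this check would confirm or rule out.

```python
import importlib.util
import sys


def describe_environment(module_name):
    """Report which interpreter is running and whether it can locate module_name.

    Hypothetical diagnostic helper; module_name should be a top-level
    package name such as "pandas".
    """
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,            # path of the running interpreter
        "prefix": sys.prefix,                    # root of the active environment
        "module": module_name,
        "found": spec is not None,               # False if the import would fail
        "origin": getattr(spec, "origin", None), # file the module would load from
        "sys_path": list(sys.path),              # directories searched on import
    }


if __name__ == "__main__":
    # Driver-side report; print one field per line for easy comparison.
    for key, value in describe_environment("pandas").items():
        print(f"{key}: {value}")
```

Calling the same function inside foo() and returning its result through ray.get would give the worker-side report. If the two "executable" or "prefix" values differ, the worker is presumably not running in the p_c environment.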

Source: https://stackoverflow.com/questions/64349519/error-in-ray-modulenotfounderror-no-module-named-pandas
