psutil in Apache Spark

Submitted by ╄→гoц情女王★ on 2020-01-02 02:28:28

Question


I'm using PySpark 1.5.2. After I call .collect(), I get the warning: UserWarning: Please install psutil to have better support with spilling.

Why is this warning shown?

How can I install psutil?


Answer 1:


pip install psutil

If you need to install specifically for Python 2 or 3, use pip2 or pip3; psutil supports both major versions. The PyPI package is at https://pypi.org/project/psutil/.
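Once the install finishes, a quick sanity check (assuming pip succeeded) is to import psutil and query it:

```python
# Verify psutil is importable and functional after installation.
import psutil

print(psutil.__version__)             # installed version string
print(psutil.virtual_memory().total)  # total system memory, in bytes
```

If the import fails, the warning from Spark will keep appearing, since PySpark falls back to cruder per-OS memory checks.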




Answer 2:


You can clone or download the psutil project from https://github.com/giampaolo/psutil.git,

then run setup.py to install psutil.

In 'spark/python/pyspark/shuffle.py' you can see the following code:

def get_used_memory():
    """ Return the used memory in MB """
    if platform.system() == 'Linux':
        for line in open('/proc/self/status'):
            if line.startswith('VmRSS:'):
                return int(line.split()[1]) >> 10

    else:
        warnings.warn("Please install psutil to have better "
                      "support with spilling")
        if platform.system() == "Darwin":
            import resource
            rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
            return rss >> 20
        # TODO: support windows

    return 0

So if your OS is not Linux, installing psutil is suggested, since the fallback memory check above only covers Linux and macOS.
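For comparison, a minimal sketch of what an OS-independent version of get_used_memory() looks like once psutil is available (the function name here is hypothetical; it mirrors the psutil branch that shuffle.py uses when the import succeeds):

```python
# Sketch: psutil-based equivalent of get_used_memory(), portable across OSes.
import psutil

def get_used_memory_psutil():
    """Return this process's resident set size (RSS) in MB."""
    process = psutil.Process()  # current process by default
    rss_bytes = process.memory_info().rss
    return rss_bytes >> 20      # bytes -> MB

print(get_used_memory_psutil())
```

This avoids parsing /proc/self/status on Linux and the resource module on macOS, and it also works on Windows, which the original code leaves as a TODO.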



Source: https://stackoverflow.com/questions/34503549/psutil-in-apache-spark
