Question
I'm using PySpark 1.5.2. I get the warning

UserWarning: Please install psutil to have better support with spilling

after I issue the command .collect(). Why is this warning shown, and how can I install psutil?
Answer 1:
pip install psutil

If you need to install specifically for Python 2 or Python 3, use pip2 or pip3; psutil works for both major versions. Here is the PyPI package for psutil.
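After installing, you can confirm the import works. Below is a minimal sketch (not PySpark's exact code) of the optional-dependency pattern that produces this warning: psutil is tried first, and the warning is only emitted when the import fails.

```python
import warnings

# Sketch of the optional-dependency pattern behind the warning:
# try psutil first; warn only when the import fails.
try:
    import psutil  # optional: accurate, cross-platform memory stats
    HAVE_PSUTIL = True
except ImportError:
    warnings.warn("Please install psutil to have better support with spilling")
    HAVE_PSUTIL = False

# Once `pip install psutil` succeeds, HAVE_PSUTIL is True and the warning stops.
print(HAVE_PSUTIL)
```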
Answer 2:
You can clone or download the psutil project from https://github.com/giampaolo/psutil.git, then run setup.py to install psutil.

In 'spark/python/pyspark/shuffle.py' you can see the following code:
def get_used_memory():
    """ Return the used memory in MB """
    if platform.system() == 'Linux':
        for line in open('/proc/self/status'):
            if line.startswith('VmRSS:'):
                return int(line.split()[1]) >> 10
    else:
        warnings.warn("Please install psutil to have better "
                      "support with spilling")
        if platform.system() == "Darwin":
            import resource
            rss = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
            return rss >> 20
        # TODO: support windows
    return 0
So I guess that if your OS is not Linux, installing psutil is suggested. (Note that this function is the fallback path; since the warning itself asks you to install psutil, it only appears when psutil is missing, so installing it makes the warning go away.)
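The bit shifts in get_used_memory() are unit conversions, which a small example (values chosen purely for illustration) makes concrete: /proc/self/status reports VmRSS in kB, so >> 10 divides by 1024 to get MB, while ru_maxrss on macOS is in bytes, so >> 20 divides by 1024**2.

```python
# Illustrative values only: converting the raw readings to MB with bit shifts.
vmrss_kb = 2_097_152             # VmRSS from /proc/self/status is in kB
print(vmrss_kb >> 10)            # 2048 MB: >> 10 divides by 1024

ru_maxrss_bytes = 2_147_483_648  # ru_maxrss on macOS is in bytes
print(ru_maxrss_bytes >> 20)     # 2048 MB: >> 20 divides by 1024**2
```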
Source: https://stackoverflow.com/questions/34503549/psutil-in-apache-spark