No module named 'resource' installing Apache Spark on Windows

谎友^ 2020-12-25 08:17

I am trying to install Apache Spark to run locally on my Windows machine. I have followed all the instructions here: https://medium.com/@loldja/installing-apache-spark-pyspark-th
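
For context, even a minimal job is enough to hit the error, because any action launches the Python worker whose startup performs the failing import. A sketch of such a repro on an affected build (Spark 2.4.0; the app name is arbitrary):

    from pyspark.sql import SparkSession

    # Any action that starts a Python worker imports pyspark/worker.py on the
    # executor side; on Windows with Spark 2.4.0 that import fails with
    # ModuleNotFoundError: No module named 'resource'
    spark = SparkSession.builder.master("local[*]").appName("repro").getOrCreate()
    print(spark.sparkContext.parallelize(range(10)).sum())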

4 Answers
  • 2020-12-25 08:48

    The fix can be found at https://github.com/apache/spark/pull/23055.

    The resource module is only available on Unix/Linux systems and is not applicable in a Windows environment. This fix is not yet included in the latest release, but you can modify worker.py in your installation as shown in the pull request. The changes to that file can be found at https://github.com/apache/spark/pull/23055/files.

    You will have to re-zip the pyspark directory and move it to the lib folder in your pyspark installation directory (where you extracted the pre-compiled pyspark according to the tutorial you mentioned).
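
    For reference, the shape of the change in that pull request is to guard the import and skip the memory-limit setup when the module is absent; roughly the following (see the linked files view for the exact diff):

        # Near the top of python/pyspark/worker.py: 'resource' exists only on
        # Unix-like systems, so the import is guarded instead of unconditional.
        has_resource_module = True
        try:
            import resource
        except ImportError:
            has_resource_module = False

        # Inside main(), the memory-limit setup then runs only when the module
        # actually loaded, so Windows workers skip it:
        memory_limit_mb = int(os.environ.get('PYSPARK_EXECUTOR_MEMORY_MB', "-1"))
        if memory_limit_mb > 0 and has_resource_module:
            ...  # the existing resource.getrlimit / resource.setrlimit logic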

  • 2020-12-25 08:56

    Adding to all those valuable answers:

    For Windows users, make sure you have copied the correct version of the winutils.exe file (for your specific version of Hadoop) to the spark/bin folder.

    Say, if you have Hadoop 2.7.1, then you should copy the winutils.exe file from the hadoop-2.7.1/bin folder.

    The link for that is here: https://github.com/steveloughran/winutils
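
    If Spark still cannot find winutils.exe after copying it, also check that HADOOP_HOME points at the directory containing the bin folder. A minimal sketch, assuming a hypothetical C:\hadoop location:

        import os

        # C:\hadoop is a placeholder; use wherever you put the winutils files.
        # HADOOP_HOME must be the parent of the bin folder, not bin itself.
        os.environ["HADOOP_HOME"] = r"C:\hadoop"
        os.environ["PATH"] = os.environ["HADOOP_HOME"] + r"\bin;" + os.environ["PATH"]

        from pyspark.sql import SparkSession
        spark = SparkSession.builder.master("local[*]").getOrCreate()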

  • 2020-12-25 09:02

    I struggled the whole morning with the same problem. Your best bet is to downgrade to Spark 2.3.2.

  • 2020-12-25 09:04

    I edited the worker.py file and removed all resource-related lines: specifically, the import resource statement and the "# set up memory limits" block. The error disappeared. See the sketch below for what those pieces look like.
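
    Concretely, the removed pieces look like this in the Spark 2.4.0 layout of worker.py (positions approximate; check your own copy):

        # At the top of python/pyspark/worker.py: delete this import.
        import resource

        # Inside main(): delete the "# set up memory limits" block, which starts
        # roughly like this and ends with the resource.setrlimit() call.
        memory_limit_mb = int(os.environ.get('PYSPARK_EXECUTOR_MEMORY_MB', "-1"))
        total_memory = resource.RLIMIT_AS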
