No module named py4j.protocol on Eclipse (PyDev)

Posted by 穿精又带淫゛_ on 2020-01-06 21:43:09

Question


I configured Eclipse to develop with Spark and Python. I set up:

1. PyDev with the Python interpreter
2. PyDev with the Spark Python sources
3. PyDev with the Spark environment variables

This is my Libraries configuration:

And this is my Environment configuration:

I created a project named CompensationStudy, and I want to run a small example to make sure that everything works.

This is my code:

import os

from pyspark import SparkConf, SparkContext

# Run Spark locally with a single worker thread.
sparkConf = SparkConf().setAppName("WordCounts").setMaster("local")
sc = SparkContext(conf=sparkConf)

# Count how often each word occurs in Spark's README.
textFile = sc.textFile(os.path.join(os.environ["SPARK_HOME"], "README.md"))
wordCounts = (textFile.flatMap(lambda line: line.split())
              .map(lambda word: (word, 1))
              .reduceByKey(lambda a, b: a + b))
for wc in wordCounts.collect():
    print(wc)

But I got this error: ImportError: No module named py4j.protocol

Logically, all of PySpark's library dependencies, including Py4J, should be imported automatically once PyDev is configured with the Spark Python sources. So what is wrong here? Is it just a problem with my log4j.properties file? Please help!


Answer 1:


Are you able to run it from the command line? I think the first step is to take the IDE out of the question: get everything running from the command line with the proper environment variables (maybe asking the pyspark community for help). Once that works, compare the environment variables of the command-line run with those of the IDE run. To do that, write a small program that prints the environment variables, run it in the console and then in the IDE, and check the difference.
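The comparison described above could be sketched as follows. This is only an illustration: the helper names environment_snapshot and diff_snapshots and the list of watched variables are assumptions, not part of PyDev or PySpark. Run the script once from the console and once from Eclipse and compare the printed output.

```python
import json
import os
import sys

# Variables that usually matter for a PySpark launch; this list is an assumption.
WATCHED_VARS = ("SPARK_HOME", "PYTHONPATH", "PYSPARK_PYTHON", "JAVA_HOME", "PATH")


def environment_snapshot(environ=None, path=None):
    """Collect the interpreter, selected environment variables and sys.path."""
    environ = os.environ if environ is None else environ
    path = sys.path if path is None else path
    return {
        "executable": sys.executable,
        "env": {name: environ.get(name) for name in WATCHED_VARS},
        "sys_path": list(path),
    }


def diff_snapshots(console, ide):
    """Return what differs between a console snapshot and an IDE snapshot."""
    differences = {}
    for name in WATCHED_VARS:
        if console["env"].get(name) != ide["env"].get(name):
            differences[name] = (console["env"].get(name), ide["env"].get(name))
    # Path entries the console run has but the IDE run lacks.
    missing = [p for p in console["sys_path"] if p not in ide["sys_path"]]
    if missing:
        differences["sys_path entries missing in IDE"] = missing
    return differences


if __name__ == "__main__":
    # Run once from the console and once from Eclipse, then compare the output.
    print(json.dumps(environment_snapshot(), indent=2, sort_keys=True))
```

A sys.path entry that shows up in the console run but not in the IDE run is the most likely place for the missing py4j package to be hiding.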

One note (which is probably not the issue, but still): from your screenshot, it seems that your project configuration adds /CompensationStudy to the PYTHONPATH, yet your code appears to live in /CompensationStudy/src. You should edit your project configuration so that only /CompensationStudy/src is on the PYTHONPATH.




Answer 2:


I had a similar error. After installing py4j, I was able to run the code without it:

sudo pip install py4j
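As a possible alternative to a system-wide install, Spark distributions normally ship Py4J as a source zip under $SPARK_HOME/python/lib (named py4j-<version>-src.zip), and that zip can be put on the path instead. The sketch below assumes that layout; find_bundled_py4j and ensure_py4j are hypothetical helper names, not PySpark functions. In PyDev, the equivalent step would be adding the same zip to the interpreter's Libraries.

```python
import glob
import importlib
import os
import sys


def find_bundled_py4j(spark_home):
    """Return the py4j source zips found under <spark_home>/python/lib, sorted."""
    pattern = os.path.join(spark_home, "python", "lib", "py4j-*-src.zip")
    return sorted(glob.glob(pattern))


def ensure_py4j(spark_home=None):
    """Make py4j importable, falling back to the zip shipped with Spark.

    Returns the zip path that was added to sys.path, or None if py4j was
    already importable or no bundled zip could be found.
    """
    try:
        importlib.import_module("py4j.protocol")
        return None  # already importable, nothing to do
    except ImportError:
        pass
    spark_home = spark_home or os.environ.get("SPARK_HOME")
    if not spark_home:
        return None
    candidates = find_bundled_py4j(spark_home)
    if not candidates:
        return None
    # Python can import directly from a zip archive listed on sys.path.
    sys.path.insert(0, candidates[-1])
    return candidates[-1]
```

Calling ensure_py4j() before the first pyspark import is enough for a quick check; a permanent fix belongs in the interpreter or project configuration.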


Source: https://stackoverflow.com/questions/43070522/no-module-named-py4j-protocol-on-eclipse-pydev
