Question
I installed Zeppelin on Windows following this tutorial and this one. I also installed Java 8 to avoid problems.
I'm now able to start the Zeppelin server, and I'm trying to run this code:
%pyspark
a=5*4
print("value = %i" % (a))
sc.version
I'm getting a py4j-related error. I had other problems with this library before (same as here), and to avoid them I replaced the py4j library in both Zeppelin and Spark on my computer with the latest version, py4j 0.10.7.
This is the error I get:
Traceback (most recent call last):
File "C:\Users\SHIRM~1.ARG\AppData\Local\Temp\zeppelin_pyspark-1240802621138907911.py", line 309, in <module>
sc = _zsc_ = SparkContext(jsc=jsc, gateway=gateway, conf=conf)
File "C:\Users\SHIRM.ARGUS\spark-2.3.2\spark-2.3.2-bin-hadoop2.7\python\pyspark\context.py", line 118, in __init__
conf, jsc, profiler_cls)
File "C:\Users\SHIRM.ARGUS\spark-2.3.2\spark-2.3.2-bin-hadoop2.7\python\pyspark\context.py", line 189, in _do_init
self._javaAccumulator = self._jvm.PythonAccumulatorV2(host, port, auth_token)
File "C:\Users\SHIRM.ARGUS\Documents\zeppelin-0.8.0-bin-all\interpreter\spark\pyspark\py4j-0.10.7-src.zip\py4j\java_gateway.py", line 1525, in __call__
File "C:\Users\SHIRM.ARGUS\Documents\zeppelin-0.8.0-bin-all\interpreter\spark\pyspark\py4j-0.10.7-src.zip\py4j\protocol.py", line 332, in get_return_value
py4j.protocol.Py4JError: An error occurred while calling None.org.apache.spark.api.python.PythonAccumulatorV2. Trace:
I googled it, but couldn't find anyone else who has run into this.
Does anyone have an idea how I can solve it?
Thanks
Answer 1:
I suspect you have Java 9 or 10 installed. Uninstall whichever of those you have and install a fresh copy of Java 8 from here: https://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
Then set JAVA_HOME inside hadoop_env.cmd (open it with any text editor).
Note: Java 8 and 7 are the stable versions to use, so uninstall any other Java versions. Make sure JAVA_HOME points to the JDK (not the JRE).
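To sanity-check which Java version Spark will pick up, you can parse the first line of `java -version` output. This is a minimal sketch, not part of the original answer; the function names are mine, and it assumes `java` is on the PATH and prints its version in either the legacy `"1.8.0_xxx"` style or the post-Java-9 `"10.0.2"` style:

```python
import re
import subprocess

def parse_java_major(version_line):
    """Extract the major Java version from a `java -version` line.

    Handles the legacy scheme ('java version "1.8.0_181"' -> 8) and the
    post-Java-9 scheme ('java version "10.0.2"' -> 10).
    """
    m = re.search(r'"(\d+)\.(\d+)[^"]*"', version_line)
    if not m:
        raise ValueError("unrecognized version string: %r" % version_line)
    major, minor = int(m.group(1)), int(m.group(2))
    return minor if major == 1 else major  # "1.8..." means Java 8

def running_java_8():
    # `java -version` prints to stderr, not stdout
    out = subprocess.run(["java", "-version"],
                         capture_output=True, text=True).stderr
    return parse_java_major(out.splitlines()[0]) == 8
```

If `running_java_8()` returns False, that matches the symptom the answer describes: Spark 2.3.x and this py4j version only work reliably on Java 8.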
Answer 2:
I faced the same problem today, and I fixed it by adding a PYTHONPATH system environment variable like: %SPARK_HOME%\python\lib\py4j;%SPARK_HOME%\python\lib\pyspark
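The fix above amounts to building PYTHONPATH entries out of SPARK_HOME. A minimal sketch of that assembly (the function name is mine, and the `py4j-0.10.7-src.zip` file name is an assumption; it must match the archive actually shipped under your Spark's python\lib directory):

```python
import os

def pyspark_path_entries(spark_home, py4j_zip="py4j-0.10.7-src.zip"):
    """Build the entries PYTHONPATH needs so `import pyspark` works.

    `py4j_zip` is assumed to match the archive under
    <spark_home>/python/lib in your installation.
    """
    return [
        os.path.join(spark_home, "python"),                  # the pyspark package
        os.path.join(spark_home, "python", "lib", py4j_zip), # py4j, importable from the zip
    ]

# Join with the platform's separator (";" on Windows, ":" elsewhere)
entries = pyspark_path_entries("/opt/spark")
pythonpath = os.pathsep.join(entries)
```

Setting PYTHONPATH to this value (with your real SPARK_HOME) before starting Zeppelin lets its Python interpreter find both pyspark and a py4j version that matches the Spark install.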
Source: https://stackoverflow.com/questions/52646868/using-pyspark-on-windows-not-working-py4j