Getting NullPointerException when running Spark Code in Zeppelin 0.7.1

梦毁少年i 2021-02-05 10:59

I have installed Zeppelin 0.7.1. When I try to execute the example Spark program (available with the Zeppelin Tutorial notebook), I am getting a NullPointerException.

9 answers
  • 2021-02-05 11:24

    I was getting exactly the same exception with Zeppelin 0.7.2 on Windows 7. I had to make several configuration changes to get it working.

    First, rename zeppelin-env.cmd.template to zeppelin-env.cmd; the file is located in the %ZEPPELIN_HOME%/conf folder. Add the PYTHONPATH environment variable:

    set PYTHONPATH=%SPARK_HOME%\python;%SPARK_HOME%\python\lib\py4j-0.10.4-src.zip;%SPARK_HOME%\python\lib\pyspark.zip
    

    Open zeppelin.cmd from %ZEPPELIN_HOME%/bin and add SPARK_HOME and ZEPPELIN_HOME as the first lines of the script. SPARK_HOME is left blank because I was using the embedded Spark library; I added ZEPPELIN_HOME to make sure this variable is set at the earliest stage of startup:

    set SPARK_HOME=
    set ZEPPELIN_HOME=<PATH to zeppelin installed folder>
    

    Next, copy all the jars and the pyspark package from %SPARK_HOME% into the Zeppelin Spark interpreter folder:

    cp %SPARK_HOME%/jars/*.jar %ZEPPELIN_HOME%/interpreter/spark
    cp -r %SPARK_HOME%/python/pyspark %ZEPPELIN_HOME%/interpreter/spark/pyspark
    

    I wasn't starting interpreter.cmd while accessing the notebook, and this was causing the NullPointerException. I opened two command prompts: in one I started zeppelin.cmd, and in the other interpreter.cmd.

    We have to pass the interpreter directory, a port, and the path to Zeppelin's local-repo on the command line. You can find the local-repo path on the Spark interpreter page in the Zeppelin UI; use exactly that same path when starting interpreter.cmd:

    interpreter.cmd  -d %ZEPPELIN_HOME%\interpreter\spark\ -p 5050  -l %ZEPPELIN_HOME%\local-repo\2D64VMYZE
    

    The host and port then need to be specified on the Spark interpreter page in the Zeppelin UI. Select "Connect to existing process" and enter:

    HOST : localhost
    PORT : 5050
    

    Once all these configuration changes are in place, save and restart the Spark interpreter. Then create a new notebook and type sc.version; it will print the Spark version. Note that Zeppelin 0.7.2 doesn't support Spark 2.2.1.
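
    As a quick check, a minimal test paragraph looks like this (shown as a Zeppelin Scala paragraph; the version printed depends on the Spark build your interpreter is bound to):

        %spark
        sc.version

    which should print something like res0: String = 2.1.0 (illustrative output, not a guarantee).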

  • 2021-02-05 11:25
        Caused by: java.net.ConnectException: Connection refused (Connection refused)
            at java.net.PlainSocketImpl.socketConnect(Native Method)
            at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
            at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
            at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
            at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
            at java.net.Socket.connect(Socket.java:589)
            at org.apache.thrift.transport.TSocket.open(TSocket.java:182)
            ... 74 more
    )
            at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:466)
            at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:236)
            at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
            ... 71 more
     INFO [2017-11-20 17:51:55,288] ({pool-2-thread-4} SparkInterpreter.java[createSparkSession]:369) - Created Spark session with Hive support
    ERROR [2017-11-20 17:51:55,290] ({pool-2-thread-4} Job.java[run]:181) - Job failed
    

    It looks like the Hive Metastore service is not running. Start the Metastore service and try again:

    hive --service metastore
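
    Assuming the stock Hive configuration, the metastore listens on port 9083, so you can verify it is up before re-running the notebook:

        netstat -an | grep 9083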
    
  • 2021-02-05 11:26

    Finally, I was able to find the reason. When I checked the logs in the ZL_HOME/logs directory, it turned out to be a Spark driver binding error. I added the following property in the Spark interpreter settings and it works fine now...

    PS: This issue seems to come up mainly when you connect to a VPN... and I do connect to a VPN.
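
    The property itself was shown in a screenshot that is not reproduced in this copy. For illustration only, a driver-binding property of this kind would look like the following in the interpreter settings (the name/value pair below is an assumption, not the original answer's screenshot):

        # hypothetical example -- the original property is not preserved in this copy
        spark.driver.host    localhost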

  • 2021-02-05 11:29

    Seems to be a bug in Zeppelin 0.7.1. It works fine in 0.7.2.

  • 2021-02-05 11:33

    On AWS EMR the issue was memory. I had to manually set a lower value for spark.executor.memory in the Spark interpreter settings using the Zeppelin UI.

    The value varies based on your instance size. The best approach is to check the logs located in the /mnt/var/log/zeppelin/ folder.

    In my case the underlying error was:

    Error initializing SparkContext.
    java.lang.IllegalArgumentException: Required executor memory (6144+614 MB) is above the max threshold (6144 MB) of this cluster! Please check the values of 'yarn.scheduler.maximum-allocation-mb' and/or 'yarn.nodemanager.resource.memory-mb'.
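
    The extra 614 MB in the message is YARN's executor memory overhead, which defaults to 10% of spark.executor.memory (with a 384 MB floor), so executor memory plus overhead must fit under yarn.scheduler.maximum-allocation-mb. With a 6144 MB cap, a setting like the one below would fit (the exact value is illustrative; tune it to your instance):

        # illustrative value: 5g + 512 MB overhead = 5632 MB <= 6144 MB cap
        spark.executor.memory    5g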
    

    That helped me understand why it was failing and what I could do to fix it.

    Note:

    This happened because I was starting an instance with HBase, which limits the available memory. See the defaults for each instance size here.

  • 2021-02-05 11:34

    Check whether your NameNode has gone into safe mode.

    You can check with the command below:

    sudo -u hdfs hdfs dfsadmin -safemode get
    

    To leave safe mode, use the command below:

    sudo -u hdfs hdfs dfsadmin -safemode leave
    