I have a Spark (version 1.4.1) application on HDP 2.3. It works fine in YARN-client mode. However, when running it in YARN-cluster mode, none of my Hive tables can be found by the application.
I posted this same question on the Hortonworks community, and I resolved the issue with the help of this answer.
The gist of it is this: when submitting the application, the --files argument has to come before the --jars argument, and the copy of hive-site.xml to use is the one in the Spark conf dir, not the one in $HIVE_HOME/conf/hive-site.xml. Hence:
./bin/spark-submit \
--class com.myCompany.Main \
--master yarn-cluster \
--num-executors 3 \
--driver-memory 1g \
--executor-memory 11g \
--executor-cores 1 \
--files /usr/hdp/current/spark-client/conf/hive-site.xml \
--jars lib/datanucleus-api-jdo-3.2.6.jar,lib/datanucleus-rdbms-3.2.9.jar,lib/datanucleus-core-3.2.10.jar \
/home/spark/apps/YarnClusterTest.jar
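For reference, nothing special is needed inside the application itself once the correct hive-site.xml is shipped with --files. A minimal sketch of what a main class like com.myCompany.Main might look like on Spark 1.4's HiveContext API (the class body and the table listing are illustrative assumptions, not the actual application):

package com.myCompany

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object Main {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("YarnClusterTest"))
    // HiveContext reads hive-site.xml from the classpath; in yarn-cluster
    // mode that is the copy localized into the container by --files.
    val hiveContext = new HiveContext(sc)
    // If the metastore is wired up correctly, this lists the real Hive
    // tables rather than those of an empty local metastore.
    hiveContext.sql("SHOW TABLES").show()
    sc.stop()
  }
}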
If you are able to fetch data using the Hive CLI, then use that same hive-site.xml in your Spark job. The likely cause is the metastore location defined in hive-site.xml: if Spark does not pick up the right file, it falls back to a local embedded metastore that knows nothing about your Hive tables.
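One way to verify from inside the job that the driver can actually see the shipped hive-site.xml is to check the classpath, since that is where HiveContext's HiveConf loads it from. A small diagnostic sketch, assuming that in yarn-cluster mode the container's working directory (where --files places the file) is on the classpath; HiveSiteCheck is a hypothetical helper, submitted the same way as the command above:

import org.apache.spark.{SparkConf, SparkContext}

object HiveSiteCheck {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("HiveSiteCheck"))
    // null here means HiveConf will fall back to default (local) metastore
    // settings, which is exactly the symptom of missing Hive tables.
    val url = getClass.getClassLoader.getResource("hive-site.xml")
    println("hive-site.xml found at: " + url)
    sc.stop()
  }
}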