Hive tables not found when running in YARN-Cluster mode

Asked 2020-12-11 07:34

I have a Spark (version 1.4.1) application on HDP 2.3. It works fine when running in YARN-Client mode. However, when running it in YARN-Cluster mode, none of my Hive tables can be found by the application.

2 Answers
  • 2020-12-11 08:20

    I posted this same question on the Hortonworks community, and I resolved the issue with the help of this answer.

    The gist of it is this: when submitting the application, the --files argument has to come before the --jars argument, and the copy of hive-site.xml to pass is the one in the Spark conf dir, not the one in $HIVE_HOME/conf/hive-site.xml. Hence:

      ./bin/spark-submit \
      --class com.myCompany.Main \
      --master yarn-cluster \
      --num-executors 3 \
      --driver-memory 1g \
      --executor-memory 11g \
      --executor-cores 1 \
      --files /usr/hdp/current/spark-client/conf/hive-site.xml \
      --jars lib/datanucleus-api-jdo-3.2.6.jar,lib/datanucleus-rdbms-3.2.9.jar,lib/datanucleus-core-3.2.10.jar \
      /home/spark/apps/YarnClusterTest.jar
    
  • 2020-12-11 08:22

    If you are able to fetch data using the Hive CLI, then use that same hive-site.xml in your Spark job.

    The most likely cause is the metastore location defined in hive-site.xml.
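
    To illustrate, the property to check is hive.metastore.uris; a minimal sketch of the relevant hive-site.xml fragment (the hostname and port below are placeholders, not values from the original question):

      <configuration>
        <property>
          <!-- Must point at the same metastore the Hive CLI uses;
               host/port here are illustrative only -->
          <name>hive.metastore.uris</name>
          <value>thrift://metastore-host.example.com:9083</value>
        </property>
      </configuration>

    If this property is missing or points at a different metastore, the driver in YARN-Cluster mode falls back to a local embedded metastore and sees none of your tables.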
