AWS EMR 5.11.0 - Apache Hive on Spark

温柔的废话

I am trying to set up Apache Hive on Spark on AWS EMR 5.11.0 (Apache Spark 2.2.1, Apache Hive 2.3.2). The YARN logs show the error below:

18/01/28 21:55:28

4 Answers

2021-01-14 13:46

I was able to run Hive on Spark by launching it like this:

HIVE_AUX_JARS_PATH=$(find /usr/lib/spark/jars/ -name '*.jar' -and -not -name '*slf4j-log4j12*' -printf '%p:' | head -c-1) hive
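To see what that pipeline produces, here is a minimal sketch that runs the same `find ... | head -c-1` expression against a throwaway directory. The jar file names are made up for illustration, and it assumes GNU `find` (for `-printf`) and GNU `head` (for the negative `-c` count), which is what the original command relies on as well.

```shell
#!/usr/bin/env bash
# Sketch: reproduce the classpath-building pipeline on a temporary directory.
# The jar names below are illustrative placeholders, not a real EMR listing.
demo_dir=$(mktemp -d)
touch "$demo_dir/spark-core_2.11-2.2.1.jar" \
      "$demo_dir/slf4j-log4j12-1.7.16.jar" \
      "$demo_dir/hive-exec-1.2.1.spark2.jar"

# -printf '%p:' prints every matching path followed by a colon;
# head -c-1 then drops the last byte, i.e. the trailing colon.
aux_path=$(find "$demo_dir" -name '*.jar' -and -not -name '*slf4j-log4j12*' -printf '%p:' | head -c-1)

# Show one classpath entry per line for inspection.
echo "$aux_path" | tr ':' '\n'

rm -rf "$demo_dir"
```

The result is a colon-separated list of every jar except the `slf4j-log4j12` one, with no trailing colon, which is the shape `HIVE_AUX_JARS_PATH` is given in the command above.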

Then, before issuing any other SQL queries, run:

SET hive.execution.engine = spark;

To make this persistent, add the following line to /home/hadoop/.bashrc:

export HIVE_AUX_JARS_PATH=$(find /usr/lib/spark/jars/ -name '*.jar' -and -not -name '*slf4j-log4j12*' -printf '%p:' | head -c-1)

Then, in /etc/hive/conf/hive-site.xml, set:

<property>
  <name>hive.execution.engine</name>
  <value>spark</value>
</property>
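A quick way to confirm the edit landed might look like the sketch below. `engine_from_site` is a hypothetical helper, not part of Hive or EMR, and it assumes the `<value>` element sits on the line directly after the `<name>` element, as in the snippet above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the value configured for hive.execution.engine.
# Assumes <value> is on the line immediately following <name>.
engine_from_site() {
  local site_file="$1"
  grep -A1 '<name>hive.execution.engine</name>' "$site_file" \
    | sed -n 's:.*<value>\(.*\)</value>.*:\1:p'
}

# On the EMR master node, the path from the answer would be checked if present.
if [ -f /etc/hive/conf/hive-site.xml ]; then
  engine_from_site /etc/hive/conf/hive-site.xml
fi
```

If the property was added as shown, this should print the configured engine name; no output means the property was not found in that layout.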

滥情空心

2021-01-14 13:51

The Spark build on EMR supports Hive 1.2.1, not Hive 2.x. Could you check the versions of the Hive JARs in the /usr/lib/spark/jars/ directory? SPARK_RPC_SERVER_ADDRESS was added in Hive 2.x.
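One way to do that check, as a rough sketch: list the Hive-related jars and read the versions off the file names. `list_hive_jars` is a hypothetical helper name, and the approach assumes the version is embedded in each jar's file name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list jars whose file name contains "hive" in a directory,
# so the bundled Hive version can be read from the names.
list_hive_jars() {
  local jars_dir="$1"
  find "$jars_dir" -maxdepth 1 -name '*hive*.jar' | sort
}

# On an EMR node, inspect the directory mentioned in the answer if it exists.
if [ -d /usr/lib/spark/jars ]; then
  list_hive_jars /usr/lib/spark/jars
fi
```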

轻奢々

2021-01-14 13:56

The dependencies in your sbt build (or the equivalent pom.xml) should look like this:

"org.apache.spark" %% "spark-streaming" % sparkVersion % "provided",

"org.apache.spark" %% "spark-sql" % sparkVersion % "provided",

"org.apache.spark" %% "spark-hive" % sparkVersion % "provided",

I am running a data warehouse (Hive) on EMR, and my Spark application stores its data in that warehouse.

北海茫月

2021-01-14 14:03

Sorry, but Hive on Spark is not yet supported on EMR. I have not tried it myself yet, but I think the likely cause of your errors might be a mismatch between the version of Spark supported on EMR and the version of Spark upon which Hive depends. The last time I checked, Hive did not support Spark 2.x when running Hive on Spark. Given that your first error is a NoSuchFieldError, it seems like a version mismatch is the most likely cause. The timeout error may be a red herring.
