1) I have created a SQL file that collects data from two different Hive tables and inserts it into a single Hive table.
2) We invoke this SQL file from a shell script.
3) Sample Spark settings:
SET hive.execution.engine=spark;
SET spark.master=yarn-cluster;
SET spark.app.name="ABC_${hiveconf:PRC_DT}_${hiveconf:JOB_ID}";
--SET spark.driver.memory=8g;
--SET spark.executor.memory=8g;
SET hive.exec.dynamic.partition.mode = nonstrict;
SET hive.stats.fetch.column.stats=true;
SET hive.optimize.index.filter=true;
SET hive.map.aggr=true;
SET hive.exec.parallel=true;
SET spark.executor.cores=5;
SET hive.prewarm.enabled=true;
SET hive.spark.client.future.timeout=900;
SET hive.spark.client.server.connect.timeout=100000;
4) Sample Hive query:
INSERT OVERWRITE TABLE ABC
SELECT a, b, c
FROM ${hiveconf:SCHEMA_NAME}.${hiveconf:TABLE_NAME}
WHERE JOB_ID = '${hiveconf:JOB_ID}';
5) Sample script:
hive -f $PARENTDIR/sql/test.sql --hiveconf SCHEMA_NAME=ABC --hiveconf TABLE_NAME=AB1 --hiveconf PRC_DT=${PRC_DT} --hiveconf JOB_ID=${JOB_ID}
hive -f $PARENTDIR/sql/test.sql --hiveconf SCHEMA_NAME=ABC --hiveconf TABLE_NAME=AB2 --hiveconf PRC_DT=${PRC_DT} --hiveconf JOB_ID=${JOB_ID}
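The two invocations differ only in TABLE_NAME, so they could be written as a loop that stops at the first failure. This is a minimal sketch, not the original script: HIVE_BIN, run_hive_for_table and run_all_tables are hypothetical names introduced here, and PARENTDIR, PRC_DT and JOB_ID are assumed to be set by the caller.

```shell
#!/usr/bin/env bash
# Sketch only: HIVE_BIN and both function names are assumptions,
# not part of the original script.
HIVE_BIN="${HIVE_BIN:-hive}"

# Run test.sql for a single source table, quoting every variable
# so that empty or space-containing values do not split arguments.
run_hive_for_table() {
  local table="$1"
  "$HIVE_BIN" -f "$PARENTDIR/sql/test.sql" \
    --hiveconf SCHEMA_NAME=ABC \
    --hiveconf TABLE_NAME="$table" \
    --hiveconf PRC_DT="$PRC_DT" \
    --hiveconf JOB_ID="$JOB_ID"
}

# Run the SQL file for both source tables; return non-zero
# as soon as one hive invocation fails.
run_all_tables() {
  local table
  for table in AB1 AB2; do
    if ! run_hive_for_table "$table"; then
      echo "hive failed for table $table" >&2
      return 1
    fi
  done
}
```

Calling run_all_tables after exporting PARENTDIR, PRC_DT and JOB_ID would then replace the two hand-written commands, and the non-zero return code makes a failed Spark session visible to the calling scheduler.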
6) Error log:
2016-08-24 17:30:05,651 WARN [main] mapreduce.TableMapReduceUtil: The hbase-prefix-tree module jar containing PrefixTreeCodec is not present. Continuing without it.
Logging initialized using configuration in jar:file:/opt/cloudera/parcels/CDH-5.7.2-1.cdh5.7.2.p0.18/jars/hive-common-1.1.0-cdh5.7.2.jar!/hive-log4j.properties
FAILED: SemanticException Failed to get a spark session: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create spark client.
The job fails because no ApplicationMaster is assigned before the Spark client times out. Increase the following parameter (it defaults to 90000ms; you have it set to 100000ms above):
set hive.spark.client.server.connect.timeout=300000ms;
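If you prefer to try the larger timeout without editing the SQL file first, one option is to pass it on the command line. The sketch below assumes the property can be supplied through --hiveconf like any other Hive setting; HIVE_BIN, YARN_BIN, CONNECT_TIMEOUT and both function names are hypothetical. Note that a SET for the same property inside the SQL file runs later and would override the command-line value, so the existing SET line would need to be removed or changed.

```shell
#!/usr/bin/env bash
# Sketch only: variable and function names are assumptions.
HIVE_BIN="${HIVE_BIN:-hive}"
YARN_BIN="${YARN_BIN:-yarn}"
CONNECT_TIMEOUT="${CONNECT_TIMEOUT:-300000ms}"

# Invoke hive with the connect timeout prepended to the caller's arguments.
run_with_timeout_override() {
  "$HIVE_BIN" \
    --hiveconf hive.spark.client.server.connect.timeout="$CONNECT_TIMEOUT" \
    "$@"
}

# List applications still waiting in the ACCEPTED state. An application
# that stays there is still waiting for its ApplicationMaster, which
# would be consistent with the timeout described above.
list_pending_yarn_apps() {
  "$YARN_BIN" application -list -appStates ACCEPTED
}
```

For example, run_with_timeout_override -f "$PARENTDIR/sql/test.sql" --hiveconf SCHEMA_NAME=ABC would run the same file with the longer timeout. If list_pending_yarn_apps keeps showing the Hive on Spark application while the client waits, the cluster queue is probably short on resources, and raising the timeout only buys time.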