Using HDP 2.5.3 and I've been trying to debug some YARN container classpath issues.
Since HDP includes both Spark 1.6 and 2.0.0, there have been some conflicting versions on the classpath.
Found an issue related to this: if you create an org.apache.spark.sql.SQLContext before creating the Hive context, hive-site.xml is not picked up properly when the Hive context is created.
Solution: create the Hive context before creating any other SQL context.
You can use the Spark property spark.yarn.dist.files and specify the path to hive-site.xml there.
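For illustration, a minimal sketch of such a submit command (the deploy mode, class name, jar, and file paths are placeholders, not from the original post):

```shell
# Ship a known-good hive-site.xml into the YARN containers so the driver
# picks it up even in cluster mode, instead of relying on whatever sits in
# the conf directory of the node that happens to run the driver.
# All paths and names below are illustrative placeholders.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --conf spark.yarn.dist.files=/etc/spark/conf/hive-site.xml \
  --class com.example.MyApp \
  my-app.jar
```

The more common `--files /etc/spark/conf/hive-site.xml` option achieves the same distribution.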
In cluster mode, the configuration is read from the conf directory of the machine that runs the driver container, not the one used for spark-submit.
The way I understand it, in local or yarn-client modes:

- hive-site.xml is searched for in the CLASSPATH by the Hive/Hadoop client libs (including via driver.extraClassPath, because the Driver runs inside the Launcher and the merged CLASSPATH is already built at this point);
- the Driver reads its own copy from $SPARK_CONF_DIR/hive-site.xml;
- when the Metastore connection is opened, hive-site.xml is again searched for in the CLASSPATH by the Hive/Hadoop client libs (and the Kerberos token is used, if any).

So you can have one hive-site.xml stating that Spark should use an embedded, in-memory Derby instance as a sandbox (in-memory implying "stop leaving all these temp files behind you") while another hive-site.xml gives the actual Hive Metastore URI. And all is well.
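As a sketch of that two-file setup (the file location and Derby database name are assumptions, not from the original post):

```shell
# A "sandbox" hive-site.xml pointing Hive at an embedded, in-memory Derby
# metastore; the real hive-site.xml (with the actual Metastore URI) lives
# elsewhere, e.g. in the cluster's conf directory.
mkdir -p ./sandbox-conf
cat > ./sandbox-conf/hive-site.xml <<'EOF'
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:derby:memory:metastore_db;create=true</value>
  </property>
</configuration>
EOF
```

Pointing $SPARK_CONF_DIR at ./sandbox-conf would then select the Derby sandbox.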
Now, in yarn-cluster
mode, all that mechanism pretty much explodes in a nasty, undocumented mess.
The Launcher needs its own CLASSPATH settings to create the Kerberos tokens, otherwise it fails silently. Better go to the source code to find out which undocumented Env variable you should use.
It may also need an override in some properties because the hard-coded defaults suddenly are not the defaults any more (silently).
The Driver cannot tap the original $SPARK_CONF_DIR; it has to rely on what the Launcher has made available for upload. Does that include a copy of $SPARK_CONF_DIR/hive-site.xml? Looks like it's not the case. So you are probably using a Derby thing as a stub.
And the Driver has to make do with whatever YARN has forced on the container CLASSPATH, in whatever order.
Besides, the driver.extraClassPath additions do NOT take precedence by default; for that you have to force spark.yarn.user.classpath.first=true (which is translated to the standard Hadoop property whose exact name I can't remember right now, especially since there are multiple props with similar names that may be deprecated and/or not working in Hadoop 2.x).
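A hedged sketch of forcing that precedence at submit time (the jar and class names are illustrative placeholders):

```shell
# Make user-supplied classpath entries win over the entries YARN prepends
# by default: spark.yarn.user.classpath.first=true flips the precedence.
# The jar and class names are illustrative placeholders.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --conf spark.driver.extraClassPath=my-deps.jar \
  --conf spark.yarn.user.classpath.first=true \
  --class com.example.MyApp \
  my-app.jar
```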
yarn-cluster mode. The connection is done in the Executors; that's another layer of nastiness. But I digress.
Bottom line: start your diagnostic again.
A. Are you really, really sure that the mysterious "Metastore connection errors" are caused by missing properties, and specifically the Metastore URI?
B. By the way, are your users explicitly using a HiveContext???
C. What is exactly the CLASSPATH that YARN presents to the Driver JVM, and what is exactly the CLASSPATH that the Driver presents to the Hadoop libs when opening the Metastore connection?
D. If the CLASSPATH built by YARN is messed up for some reason, what would be the minimal fix -- change in precedence rules? addition? both?
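One way to start answering C (and to inform D) is to read what YARN actually handed the container; a sketch, where the application ID is a placeholder:

```shell
# Dump the aggregated YARN logs for the application; the driver container's
# logs typically include the JVM launch details, and with spark.logConf=true
# Spark also logs its effective configuration at startup.
# <application_id> is a placeholder for your app's YARN id.
yarn logs -applicationId <application_id> > app-logs.txt
grep -i classpath app-logs.txt
```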