Question
I am trying to create two DataFrames and join them using the DataFrame.join method.
Here is the Scala code:
import org.apache.spark.sql.SparkSession
import org.apache.spark.SparkConf

object RuleExecutor {
  def main(args: Array[String]): Unit = {
    val sparkConf = new SparkConf().setAppName(AppConstants.AppName).setMaster("local")
    val sparkSession = SparkSession.builder()
      .appName(AppConstants.AppName)
      .config(sparkConf)
      .enableHiveSupport()
      .getOrCreate()
    import sparkSession.sql

    sql(s"CREATE DATABASE test")
    sql("CREATE TABLE test.box_width (id INT, width INT)")    // Create table box_width
    sql("INSERT INTO test.box_width VALUES (1,1), (2,2)")     // Insert data into box_width
    sql("CREATE TABLE test.box_length (id INT, length INT)")  // Create table box_length
    sql("INSERT INTO test.box_length VALUES (1,10), (2,20)")  // Insert data into box_length

    val widthDF = sql("select * from test.box_width")    // Get DF for table box_width
    val lengthDF = sql("select * from test.box_length")  // Get DF for table box_length

    val dimensionDF = lengthDF.join(widthDF, "id");  // Joining
    dimensionDF.show();
  }
}
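For reference, the join itself does not depend on Hive at all. Below is a minimal sketch (not the original program: `AppConstants.AppName` is replaced by a literal name, and the tables are built as in-memory DataFrames) that reproduces the same `lengthDF.join(widthDF, "id")` without `enableHiveSupport()`, which sidesteps the metastore entirely:

```scala
import org.apache.spark.sql.SparkSession

object JoinSketch {
  def main(args: Array[String]): Unit = {
    // No enableHiveSupport() -> no Hive metastore is instantiated,
    // so the DataNucleus error from the question cannot occur here
    val spark = SparkSession.builder()
      .appName("JoinSketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val widthDF  = Seq((1, 1), (2, 2)).toDF("id", "width")
    val lengthDF = Seq((1, 10), (2, 20)).toDF("id", "length")

    // Inner join on the shared "id" column, as in the question
    lengthDF.join(widthDF, "id").show()

    spark.stop()
  }
}
```

If this sketch runs but the original program does not, the problem is the Hive metastore setup rather than the join logic.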
But when running the code, I get the following error:
Exception in thread "main" java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveSessionStateBuilder':
at org.apache.spark.sql.SparkSession$.org$apache$spark$sql$SparkSession$$instantiateSessionState(SparkSession.scala:1062)…..
Caused by: org.apache.spark.sql.AnalysisException: java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient;
at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:106)……
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)……
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1523)……
Caused by: java.lang.reflect.InvocationTargetException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)…
Caused by: org.datanucleus.api.jdo.exceptions.ClassNotPersistenceCapableException: The class "org.apache.hadoop.hive.metastore.model.MVersionTable" is not persistable. This means that it either hasnt been enhanced, or that the enhanced version of the file is not in the CLASSPATH (or is hidden by an unenhanced version), or the Meta-Data/annotations for the class are not found.
NestedThrowables:
org.datanucleus.exceptions.ClassNotPersistableException: The class "org.apache.hadoop.hive.metastore.model.MVersionTable" is not persistable. This means that it either hasnt been enhanced, or that the enhanced version of the file is not in the CLASSPATH (or is hidden by an unenhanced version), or the Meta-Data/annotations for the class are not found.
at org.datanucleus.api.jdo.NucleusJDOHelper.getJDOExceptionForNucleusException(NucleusJDOHelper.java:473)……
Caused by: org.datanucleus.exceptions.ClassNotPersistableException: The class "org.apache.hadoop.hive.metastore.model.MVersionTable" is not persistable. This means that it either hasnt been enhanced, or that the enhanced version of the file is not in the CLASSPATH (or is hidden by an unenhanced version), or the Meta-Data/annotations for the class are not found.
at org.datanucleus.ExecutionContextImpl.assertClassPersistable(ExecutionContextImpl.java:5113)……
The versions I am using are:
Scala = 2.11
Spark-hive = 2.2.2
Maven-org-spark-project-hive_hive-metastore = 1.x
DataNucleus = 5.x
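The version list above may itself be the problem: a `ClassNotPersistableException` like the one in the trace typically means the DataNucleus runtime on the classpath cannot use the JDO metadata of the Hive 1.x metastore model classes, and to the best of my knowledge Spark 2.2's built-in Hive support ships the DataNucleus 3.2.x line, not 5.x. A hedged POM sketch of one possible fix (an assumption, not a verified solution for this exact project):

```xml
<!-- Sketch: rely on the DataNucleus versions that spark-hive pulls in
     transitively, instead of declaring DataNucleus 5.x explicitly -->
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-hive_2.11</artifactId>
  <version>2.2.2</version>
</dependency>
<!-- ...and remove any explicit org.datanucleus:* 5.x entries from the POM,
     since they shadow the 3.2.x jars the Hive 1.x metastore classes need -->
```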
How can I resolve this issue? (See the complete log and the list of dependencies.)
Thanks
Answer 1:
First of all, you don't need `;` at the end of lines in Scala, unless you have more than one expression on the same line.
Second, I went through your log, and there are 15 errors, mainly either a database table that is missing or Hive that cannot be found. So I think those services are not running correctly. Could you make sure everything (Hive, the MySQL metastore DB) is set up correctly before running the Spark job?
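To check the setup described above in isolation, a hedged sketch that only touches the metastore can help: if this fails, the problem is Hive/metastore configuration, not the application's join code.

```scala
import org.apache.spark.sql.SparkSession

object MetastoreSmokeTest {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("MetastoreSmokeTest")
      .master("local[*]")
      // Explicit local warehouse dir so table data lands in a known place
      .config("spark.sql.warehouse.dir", "spark-warehouse")
      .enableHiveSupport()
      .getOrCreate()

    // Instantiates the Hive metastore client; this is the call that
    // would surface the ClassNotPersistableException if it is present
    spark.sql("SHOW DATABASES").show()

    spark.stop()
  }
}
```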
Source: https://stackoverflow.com/questions/52628227/scala-and-sparksql-classnotpersistableexception