Question:
I have a dataframe in Spark that I am saving to Hive as a table, but I am getting the error message below:

java.lang.RuntimeException:
com.hortonworks.spark.sql.hive.llap.HiveWarehouseConnector
does not allow create table as select.
at scala.sys.package$.error(package.scala:27)

Can anyone please help me with how I should save this as a table in Hive?
val df3 = df1.join(df2, df1("inv_num") === df2("inv_num")) // join both dataframes on inv_num
  .withColumn("finalSalary", when(df1("salary") < df2("salary"), df2("salary") - df1("salary"))
    .otherwise(
      when(df1("salary") > df2("salary"), df1("salary") + df2("salary")) // 5000+3000=8000 check
        .otherwise(df2("salary")))) // take the value from the second dataframe
  .drop(df1("salary"))
  .drop(df2("salary"))
  .withColumnRenamed("finalSalary", "salary")
The code below is not working; when I execute it, it throws:

java.lang.RuntimeException:
com.hortonworks.spark.sql.hive.llap.HiveWarehouseConnector
does not allow create table as select.
at scala.sys.package$.error(package.scala:27)
df3.write
  .format("com.hortonworks.spark.sql.hive.llap.HiveWarehouseConnector")
  .option("database", "dbname")
  .option("table", "tablename")
  .mode("Append")
  .saveAsTable("tablename")
Note: the table already exists in the database, and I am using HDP 3.x.
Answer 1:
According to the Spark documentation, the behaviour of saveAsTable depends on the save mode used; the default is ErrorIfExists.
In your case, since you are writing to Hive, try insertInto instead, but keep in mind that insertInto matches columns by position, so the column order of the dataframe must match that of the destination table.
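For reference, a minimal sketch of that approach (assuming a running SparkSession `spark`, the `df3` from the question, and an existing target table `dbname.tablename`). The `alignColumns` helper is a hypothetical name introduced here for illustration, not part of the Spark API:

```scala
// insertInto matches columns by POSITION, not by name, so reorder the
// dataframe's columns to the target table's schema before writing.

// Pure helper: reorder `cols` to follow the order given by `target`
// (assumes the two contain the same column names).
def alignColumns(cols: Seq[String], target: Seq[String]): Seq[String] =
  target.filter(cols.contains)

// Usage with Spark (sketch, not executed here):
// import org.apache.spark.sql.functions.col
// val targetCols = spark.table("dbname.tablename").columns.toSeq
// val ordered   = alignColumns(df3.columns.toSeq, targetCols)
// df3.select(ordered.map(col): _*)
//    .write
//    .mode("append")
//    .insertInto("dbname.tablename")
```

If the column order already matches, the `select` step is a no-op and `insertInto` can be called directly.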
Answer 2:
Try registerTempTable (or createOrReplaceTempView on Spark 2.x and later), then spark.sql(), then write:
df3.registerTempTable("tablename")
spark.sql("SELECT salary FROM tablename")
  .write.format(HIVE_WAREHOUSE_CONNECTOR)
  .option("database", "dbname")
  .option("table", "newTable") // set the "table" option only once
  .mode("Append")
  .save()
Answer 3:
See if the solution below works for you:
val df3 = df1.join(df2, df1("inv_num") === df2("inv_num")) // join both dataframes on inv_num
  .withColumn("finalSalary", when(df1("salary") < df2("salary"), df2("salary") - df1("salary"))
    .otherwise(
      when(df1("salary") > df2("salary"), df1("salary") + df2("salary")) // 5000+3000=8000 check
        .otherwise(df2("salary")))) // take the value from the second dataframe
  .drop(df1("salary"))
  .drop(df2("salary"))
  .withColumnRenamed("finalSalary", "salary")
val hive = com.hortonworks.spark.sql.hive.llap.HiveWarehouseBuilder.session(spark).build()
df3.createOrReplaceTempView("<temp-tbl-name>")
hive.setDatabase("<db-name>")
hive.createTable("<tbl-name>")
  .ifNotExists()
  .create() // the builder chain must end with create(); add column definitions before it if the table does not exist yet
spark.sql("SELECT salary FROM <temp-tbl-name>")
  .write
  .format(HIVE_WAREHOUSE_CONNECTOR)
  .mode("append")
  .option("table", "<tbl-name>")
  .save()
Source: https://stackoverflow.com/questions/61819955/saveastable-in-spark-scala-hdp3-x