Question
I am new to Spark. I want to write DataFrame data into a Hive table. The Hive table is partitioned on multiple columns. Through the Hive metastore client, I get the partition columns and pass them as a variable to the partitionBy clause in the DataFrame's write method.
var1="country","state" (the partition column names of the Hive table)
dataframe1.write.partitionBy(s"$var1").mode("overwrite").save(s"$hive_warehouse/$dbname.db/$temp_table/")
When I execute the above code, it gives me an error saying that partition "country","state" does not exist. I think it is taking "country","state" as a single string.
Can you please help me out?
Answer 1:
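The partitionBy function takes varargs (one argument per column name), not a list and not a single combined string. Interpolating s"$var1" passes one string, so Spark looks for a single column with that whole name. You can call it as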
dataframe1.write.partitionBy("country","state").mode("overwrite").save(s"$hive_warehouse/$dbname.db/$temp_table/")
Or, in Scala, you can expand a sequence into varargs using : _* like this:
val columns = Seq("country","state")
dataframe1.write.partitionBy(columns:_*).mode("overwrite").save(s"$hive_warehouse/$dbname.db/$temp_table/")
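If the metastore lookup hands you the column names as one string rather than a Seq, you would first need to split it. Below is a minimal sketch of that step, assuming the raw value looks like "country","state" (quoted, comma-separated). The names PartitionColumns, parse, and describe are hypothetical and made up for illustration; describe is only a stand-in with the same parameter shape as partitionBy(colNames: String*), so the snippet runs without Spark.

```scala
object PartitionColumns {
  // Hypothetical helper: turns a raw string such as "\"country\",\"state\""
  // or "country, state" into a clean sequence of column names.
  def parse(raw: String): Seq[String] =
    raw.split(",").toSeq
      .map(_.trim.stripPrefix("\"").stripSuffix("\"").trim)
      .filter(_.nonEmpty)

  // Stand-in with the same varargs parameter shape as partitionBy(colNames: String*).
  // It only reports how many arguments it received.
  def describe(colNames: String*): String =
    s"${colNames.length} column(s): ${colNames.mkString(", ")}"

  def main(args: Array[String]): Unit = {
    val cols = parse("\"country\",\"state\"")
    // Expanded with : _*, each element is passed as its own argument.
    println(describe(cols: _*))
    // Passed as one joined string, the callee sees a single argument,
    // which is the situation described in the question.
    println(describe(cols.mkString(",")))
  }
}
```

With Spark, the parsed sequence would then be passed the same way as in the answer above, i.e. dataframe1.write.partitionBy(PartitionColumns.parse(var1): _*), assuming var1 is a String in the format described.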
Source: https://stackoverflow.com/questions/51569092/how-to-pass-multiple-column-in-partitionby-method-in-spark