Spark and Hive table schema out of sync after external overwrite

萌比男神i 2021-02-08 11:25

I'm having issues with the schema for Hive tables being out of sync between Spark and Hive on a MapR cluster with Spark 2.1.0 and Hive 2.1.1.

I need to try to resolve this.

1 Answer
  •  悲&欢浪女
    2021-02-08 11:27

    I faced a similar issue while using Spark 2.2.0 in the CDH 5.11.x package.

    After spark.write.mode("overwrite").saveAsTable(), issuing spark.read.table().show() displayed no data.

    On checking, I found it was a known issue with the CDH Spark 2.2.0 release. The workaround was to run the commands below after the saveAsTable command had executed.

    spark.sql("ALTER TABLE qualified_table set SERDEPROPERTIES ('path'='hdfs://{hdfs_host_name}/{table_path}')")
    
    spark.catalog.refreshTable("qualified_table")
    

    e.g.: if your table LOCATION is hdfs://hdfsHA/user/warehouse/example.db/qualified_table,
    then set 'path'='hdfs://hdfsHA/user/warehouse/example.db/qualified_table'
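
    Putting the pieces together, here is a minimal PySpark sketch of the whole sequence, assuming a hypothetical table example.qualified_table on the hdfsHA nameservice taken from the example above (substitute your own table name and LOCATION):

    from pyspark.sql import SparkSession

    # Hive support is needed so saveAsTable writes through the Hive metastore.
    spark = (SparkSession.builder
             .appName("serde-path-workaround")
             .enableHiveSupport()
             .getOrCreate())

    table = "example.qualified_table"  # hypothetical qualified table name
    location = "hdfs://hdfsHA/user/warehouse/example.db/qualified_table"  # hypothetical LOCATION

    # Overwrite the table; on the affected CDH Spark 2.2.0 builds the
    # table's SerDe 'path' property can be left stale afterwards, so
    # subsequent reads return no rows.
    df = spark.range(100).withColumnRenamed("id", "value")
    df.write.mode("overwrite").saveAsTable(table)

    # Workaround step 1: point the SerDe 'path' property back at the
    # table's actual LOCATION.
    spark.sql("ALTER TABLE {t} SET SERDEPROPERTIES ('path'='{p}')"
              .format(t=table, p=location))

    # Workaround step 2: drop the cached metadata so the next read
    # picks up the corrected path.
    spark.catalog.refreshTable(table)

    spark.read.table(table).show()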

    This worked for me; give it a try. Your issue may already be resolved by now, but if not, this method is worth a shot.

    Workaround source: https://www.cloudera.com/documentation/spark2/2-2-x/topics/spark2_known_issues.html
