Spark and Hive table schema out of sync after external overwrite

萌比男神i 2021-02-08 11:25

I'm having issues with the schema for Hive tables being out of sync between Spark and Hive on a MapR cluster with Spark 2.1.0 and Hive 2.1.1.

I need to try to resolve this.

1 Answer
  •  悲&欢浪女
    2021-02-08 11:27

    I faced a similar issue while using Spark 2.2.0 in the CDH 5.11.x package.

    After spark.write.mode("overwrite").saveAsTable(), issuing spark.read.table().show() displayed no data.

    On checking, I found it was a known issue with the CDH Spark 2.2.0 release. The workaround was to run the commands below after the saveAsTable command had executed.

    spark.sql("ALTER TABLE qualified_table set SERDEPROPERTIES ('path'='hdfs://{hdfs_host_name}/{table_path}')")
    
    spark.catalog.refreshTable("qualified_table")
    

    e.g.: if your table LOCATION is hdfs://hdfsHA/user/warehouse/example.db/qualified_table,
    then set 'path'='hdfs://hdfsHA/user/warehouse/example.db/qualified_table'
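
    Putting the pieces together, here is a minimal PySpark sketch of the whole sequence, assuming a hypothetical table example.qualified_table on the hdfsHA nameservice taken from the example above (substitute your own table name and LOCATION):

    from pyspark.sql import SparkSession

    # Hive support is needed so saveAsTable writes through the Hive metastore.
    spark = (SparkSession.builder
             .appName("serde-path-workaround")
             .enableHiveSupport()
             .getOrCreate())

    table = "example.qualified_table"  # hypothetical qualified table name
    location = "hdfs://hdfsHA/user/warehouse/example.db/qualified_table"  # hypothetical LOCATION

    # Overwrite the table; on the affected CDH Spark 2.2.0 builds the
    # table's SerDe 'path' property can be left stale afterwards, so
    # subsequent reads return no rows.
    df = spark.range(100).withColumnRenamed("id", "value")
    df.write.mode("overwrite").saveAsTable(table)

    # Workaround step 1: point the SerDe 'path' property back at the
    # table's actual LOCATION.
    spark.sql("ALTER TABLE {t} SET SERDEPROPERTIES ('path'='{p}')"
              .format(t=table, p=location))

    # Workaround step 2: drop the cached metadata so the next read
    # picks up the corrected path.
    spark.catalog.refreshTable(table)

    spark.read.table(table).show()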

    This worked for me; give it a try. Your issue may already be resolved by now, but if not, this method is worth a shot.

    Workaround source: https://www.cloudera.com/documentation/spark2/2-2-x/topics/spark2_known_issues.html
