Table loaded through Spark not accessible in Hive

暗喜 · 2021-01-18 12:26

A Hive table created through Spark (pyspark) is not accessible from Hive.

df.write.format("orc").mode("overwrite").saveAsTable("db.table")
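
One way to see where the table actually landed is to compare the two catalogs. A minimal check, assuming a SparkSession named spark built with enableHiveSupport(); db.table is the table from the snippet above:

    # Minimal sketch: list what Spark's catalog knows about `db`.
    from pyspark.sql import SparkSession

    spark = (SparkSession.builder
             .appName("catalog-check")
             .enableHiveSupport()
             .getOrCreate())

    # The table written by saveAsTable() shows up here...
    print([t.name for t in spark.catalog.listTables("db")])

    # ...while `SHOW TABLES IN db;` in beeline may not list it, because
    # Hive reads a separate catalog on HDP 3.x (see the answers below).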
2 Answers
  • 2021-01-18 12:47

    Since HDP 3.0, Apache Hive and Apache Spark use separate catalogs, and the two are mutually exclusive: the Hive catalog can only be accessed by Hive or by the Hive Warehouse Connector, and the Spark catalog can only be accessed through Spark's existing APIs. In other words, features such as ACID tables or Apache Ranger authorization on Hive tables are available to Spark only via this connector, and those Hive tables are not directly accessible through the Spark APIs themselves.

    • The article below explains the setup steps; a rough pyspark sketch of the connector write path follows the link:

    Integrating Apache Hive with Apache Spark - Hive Warehouse Connector
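
    A sketch of writing through the Hive Warehouse Connector instead of saveAsTable(). The pyspark_llap import, the HIVE_WAREHOUSE_CONNECTOR constant, and the "table" option follow the HWC documentation; the cluster configuration (HWC jar on the classpath, spark.sql.hive.hiveserver2.jdbc.url set, per the linked article) and the table name are assumptions to adapt:

        # Sketch: write via HWC so the table is created in the Hive
        # catalog, not in Spark's. Assumes the HWC jar and pyspark_llap
        # zip are on the classpath and that
        # spark.sql.hive.hiveserver2.jdbc.url is configured.
        from pyspark.sql import SparkSession
        from pyspark_llap import HiveWarehouseSession

        spark = SparkSession.builder.appName("hwc-write").getOrCreate()
        hive = HiveWarehouseSession.session(spark).build()

        df = spark.range(10).withColumnRenamed("id", "value")

        (df.write
           .format(HiveWarehouseSession.HIVE_WAREHOUSE_CONNECTOR)
           .mode("overwrite")
           .option("table", "db.table")  # placeholder table name
           .save())

    Written this way, the table is registered in the Hive metastore catalog and is visible from beeline; the trade-off is that access is routed through HiveServer2 rather than Spark's own catalog.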

  • 2021-01-18 12:49

    I faced the same issue. After setting the following properties it works fine; a sketch of applying them programmatically follows the list.

    set hive.mapred.mode=nonstrict;
    set hive.optimize.ppd=true;
    set hive.optimize.index.filter=true;
    set hive.tez.bucket.pruning=true;
    set hive.explain.user=false; 
    set hive.fetch.task.conversion=none;
    set hive.support.concurrency=true;
    set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
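
    These are Hive session settings (the last two enable Hive's ACID transaction manager), so they must run in the Hive session that queries the table, e.g. pasted into beeline. A minimal sketch of applying them programmatically with the PyHive client; PyHive itself, the host, port, and table name are assumptions, not part of the original answer:

        # Sketch: apply the session settings above via PyHive, then query.
        from pyhive import hive

        conn = hive.connect(host="hiveserver2.example.com",  # placeholder
                            port=10000, username="hive")
        cursor = conn.cursor()

        for stmt in [
            "set hive.mapred.mode=nonstrict",
            "set hive.optimize.ppd=true",
            "set hive.optimize.index.filter=true",
            "set hive.tez.bucket.pruning=true",
            "set hive.explain.user=false",
            "set hive.fetch.task.conversion=none",
            "set hive.support.concurrency=true",
            "set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager",
        ]:
            cursor.execute(stmt)

        # The Spark-written table should now be readable in this session.
        cursor.execute("SELECT * FROM db.table LIMIT 5")
        print(cursor.fetchall())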
    