Where is the Delta table location stored?

Posted by 北战南征 on 2020-03-25 21:59:29

Question


We just migrated from Parquet to Databricks Delta, using the Hive metastore. So far everything seems to work fine. When I print the location of the new Delta table with DESCRIBE EXTENDED my_table, the location is correct, but it differs from the one stored in the hiveMetastore database. When I query the hiveMetastore database I can identify the target table (and its provider is correctly set to Delta). To retrieve the metastore information I join the SDS, DBS, TBLS and TABLE_PARAMS tables from the hiveMetastore database and filter by table name, as shown next:

val sdsDF = spark.read
  .format("jdbc")
  .option("url", activeConnection.url)
  .option("dbtable", "hiveMetastore.SDS")
  .option("user", activeConnection.user)
  .option("password", activeConnection.pwd)
  .load()

val tblsDf = spark.read
  .format("jdbc")
  .option("url", activeConnection.url)
  .option("dbtable", "hiveMetastore.TBLS")
  .option("user", activeConnection.user)
  .option("password", activeConnection.pwd)
  .load()

val dbsDf = spark.read
  .format("jdbc")
  .option("url", activeConnection.url)
  .option("dbtable", "hiveMetastore.DBS")
  .option("user", activeConnection.user)
  .option("password", activeConnection.pwd)
  .load()

val paramsDf = spark.read
  .format("jdbc")
  .option("url", activeConnection.url)
  .option("dbtable", "hiveMetastore.TABLE_PARAMS")
  .option("user", activeConnection.user)
  .option("password", activeConnection.pwd)
  .load()

val resDf = sdsDF.join(tblsDf, "SD_ID")
  .join(dbsDf, "DB_ID")
  .join(paramsDf, "TBL_ID")
  .where('TBL_NAME.rlike("mytable"))
  .select(
    $"TBL_NAME",
    $"TBL_TYPE",
    $"NAME".as("DB_NAME"),
    $"DB_LOCATION_URI",
    $"LOCATION".as("TABLE_LOCATION"),
    $"PARAM_KEY",
    $"PARAM_VALUE")

All of the above is executed from a Databricks notebook.
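
For comparison with the metastore join, the location that Spark itself reports can also be read programmatically instead of being copied from the DESCRIBE EXTENDED output by eye. The following is only a minimal sketch: it assumes the table is registered as my_table in the current database and that the Location row of DESCRIBE EXTENDED carries the path in its data_type column, which is how the output is laid out as far as I can tell. DESCRIBE DETAIL is specific to Delta tables.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

// In a Databricks notebook `spark` already exists; getOrCreate() returns that same session.
val spark = SparkSession.builder.getOrCreate()

// DESCRIBE EXTENDED returns rows of (col_name, data_type, comment);
// the table location sits on the row whose col_name is "Location".
val describedLocation = spark.sql("DESCRIBE EXTENDED my_table")
  .where(col("col_name") === "Location")
  .select(col("data_type").as("location"))

// DESCRIBE DETAIL (Delta tables only) exposes the location as a regular column.
val detailLocation = spark.sql("DESCRIBE DETAIL my_table").select("location")

describedLocation.show(truncate = false)
detailLocation.show(truncate = false)
```

Both values can then be put side by side with the TABLE_LOCATION column of resDf to see exactly where they diverge.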

My question is: why am I getting two different locations even though the table name is the same? And where is the correct location of a Delta table stored, if not in the hiveMetastore database?

Source: https://stackoverflow.com/questions/60614701/where-is-the-delta-table-location-stored
