How to refresh a table and do it concurrently?

难免孤独 2021-02-08 16:18

I'm using Spark Streaming 2.1. I'd like to periodically refresh some cached tables (loaded through Spark-provided data sources such as Parquet or MySQL, or through user-defined data sources).

2 answers
  •  栀梦 (original poster)
    2021-02-08 17:17

    Spark 2.2.0 introduced a feature for refreshing a table's metadata when the table has been updated by Hive or some other external tool.

    You can do this with the following API:

    spark.catalog.refreshTable("my_table")
    

    This call invalidates and refreshes the cached data and metadata for that table, so that later queries see a consistent view of it.
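
The question also asks how to run the refresh periodically, alongside the streaming job. One possible approach, shown here only as a minimal sketch, is a background thread that calls the refresh on a fixed interval. The helper name `start_periodic_refresh`, the thread name, and the 600-second interval are hypothetical and not part of Spark; the sketch also assumes a PySpark session named `spark` and that calling `spark.catalog.refreshTable` from a second thread is acceptable in your deployment.

```python
import threading


def start_periodic_refresh(refresh, interval_seconds):
    """Call refresh() every interval_seconds on a daemon thread.

    Hypothetical helper (not a Spark API). Returns a threading.Event;
    set it to stop the loop.
    """
    stop = threading.Event()

    def loop():
        # Event.wait returns False on timeout and True once stop is set,
        # so the loop runs one refresh per interval until stopped.
        while not stop.wait(interval_seconds):
            try:
                refresh()
            except Exception as exc:
                # Keep the refresher alive if a single refresh fails.
                print(f"table refresh failed: {exc}")

    threading.Thread(target=loop, daemon=True, name="table-refresher").start()
    return stop


# Usage with Spark (assumes an existing SparkSession named `spark`):
# stop = start_periodic_refresh(
#     lambda: spark.catalog.refreshTable("my_table"), 600)
# ...
# stop.set()  # stop refreshing, e.g. before shutting down the job
```

Passing the refresh as a callable keeps the scheduling separate from Spark, so the same loop can refresh several tables or be exercised without a cluster.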
