Question
I would like to know the best way(s) of adding partitions to an external table. I have an external table on S3 in Hive, partitioned as vehicle=/date=/hr=
A new vehicle can be added at any time of day, and some vehicles will have no data for a couple of hours in a day, or for a couple of days.
A few possible solutions:
- msck repair table: it takes a lot of time.
- Add partitions via a script: I may not know when a new vehicle gets created, or which hours have no data for a vehicle.
How do people generally solve this problem of adding partitions to external tables?
Answer 1:
msck repair table
is the right way to do this. If it runs too slowly, try switching off stats autogathering before repairing the table:
set hive.stats.autogather=false;
You can enable it again after recovering partitions.
Most probably you are hitting HIVE-18743 or a related bug. In my case this helped.
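Put together, the whole sequence might look like the sketch below. The table name vehicle_data is a hypothetical placeholder for the external table from the question, and restoring the setting to true assumes autogathering was enabled beforehand (true is the Hive default).

```sql
-- Assumption: vehicle_data is a hypothetical name for the external table
-- partitioned by vehicle/date/hr.

-- Turn off automatic stats gathering for this session only.
SET hive.stats.autogather=false;

-- Scan the table location and register any partition directories
-- that are missing from the metastore.
MSCK REPAIR TABLE vehicle_data;

-- Turn stats gathering back on once the partitions are recovered.
SET hive.stats.autogather=true;
```

Because SET applies only to the current session, other jobs keep their own stats settings while the repair runs.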
Source: https://stackoverflow.com/questions/57882477/adding-partitions-to-the-external-table-in-hive-takes-a-lot-of-time