Question
I have seen applications that drop an external table, re-create it, load the data, and then run the MSCK command on every data load. What is the benefit of dropping and re-creating the table every time?
Answer 1:
There is no benefit in dropping and re-creating an EXTERNAL table, because dropping the table leaves the data intact.
There may, however, be a benefit in dropping and re-creating a MANAGED table, because dropping it deletes the data as well.
One possible scenario applies if you are running on S3:
Dropping the files early, before the load completes rather than at the time of loading, may reduce the likelihood of eventual consistency issues in S3 after the load.
First, once the files are dropped, you may hit an eventual consistency (EC) issue when reading the table, immediately after the drop and for some time afterwards. Dropping the files early gives S3 more time to synchronize before the table is read.
Second, an eventual consistency issue can arise if you write files with the same names (overwriting). Dropping early may help, though it is better to use GUID-prefixed (unique) filenames, a timestamp in the partition folder path, or a similar technique to avoid this kind of issue (eventual consistency after overwriting).
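To illustrate the unique-naming idea, here is a minimal Python sketch. The function names, the bucket path, and the load_date / load_ts folder names are made up for the example and are not part of Hive or S3; adapt them to your own layout. Note that Hive would treat a key=value folder such as load_ts as an additional partition column.

```python
import uuid
from datetime import datetime, timezone
from typing import Optional


def unique_file_name(base_name: str) -> str:
    """Prefix the file name with a random GUID so no two loads reuse a key."""
    return f"{uuid.uuid4().hex}_{base_name}"


def timestamped_partition_path(table_root: str, load_date: str,
                               now: Optional[datetime] = None) -> str:
    """Build a partition folder path that embeds the load timestamp.

    The load_date / load_ts layout is a hypothetical convention.
    """
    ts = (now or datetime.now(timezone.utc)).strftime("%Y%m%dT%H%M%SZ")
    return f"{table_root.rstrip('/')}/load_date={load_date}/load_ts={ts}"


if __name__ == "__main__":
    # Hypothetical bucket and table location.
    root = "s3://example-bucket/warehouse/events"
    print(timestamped_partition_path(root, "2019-11-05"))
    print(unique_file_name("part-00000.orc"))
```

Because every load writes to new keys, nothing is overwritten in place, so a reader cannot pick up a stale copy of a file that has the same name as a fresh one.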
Source: https://stackoverflow.com/questions/58703263/hive-table-re-create-before-load-every-date