Question
I am creating a Hive table in a Google Cloud Storage bucket using the SQL statement below.
CREATE TABLE schema_name.table_name (column1 decimal(10,0), column2 int, column3 date)
PARTITIONED BY(column7 date) STORED AS ORC
LOCATION 'gs://crazybucketstring/'
TBLPROPERTIES('ORC.COMPRESS'='SNAPPY');
Then I loaded data into this table using the distcp command. Now when I try to drop the table, it fails with the error message below. Even dropping an empty table fails.
hive>>DROP TABLE schema_name.table_name;
Error: Error while processing statement:
FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.DDLTask.MetaException
(message:java.lang.IllegalArgumentException: hadoopPath must not be null)
(state=08S01,code=1)
I also removed the files from the Google Cloud Storage bucket using the gsutil rm -r gs:// command, but I still cannot drop the table and get the same error.
Running msck repair table also gives the following error:
FAILED:
Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask (state=08S01,code=1)
Any idea what could be wrong?
Answer 1:
The problem is related to the bucket location. I will explain step by step how to reproduce it and how to solve it. The same issue also causes the msck repair command to fail.
How to reproduce it:
First I created a table (T1) with its location pointing to the bare bucket, as given here:
LOCATION 'gs://crazybucketstring/'
Then I created another table (T2) in a subfolder inside the same bucket, with the location given below:
LOCATION 'gs://crazybucketstring/schemname/tableaname/'
Now when I try to drop the first table (T1), it throws an error, because the entire bucket is acting as the table's directory and Hive cannot delete the bucket itself; it can only delete files.
When I try to drop table T2, the drop succeeds and the files inside the bucket subdirectory are deleted as well, since it is a managed table. Table T1 is still a headache.
In a desperate bid to drop table T1, I emptied the bucket using the gsutil rm -r command and tried msck repair table tablename. Strangely, the msck repair command failed with the error message below:
>> msck repair table tablename
Error: Error while processing statement: FAILED:
Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask (state=08S01,code=1)
As before, the DROP command still did not work.
Solution:
Eventually I hit on this idea, which worked.
- I altered table T1 and set its location to a subdirectory inside the bucket instead of the bare bucket.
ALTER TABLE TABLENAME SET LOCATION 'gs://crazybucketstring/schemname/tableaname/';
- I then ran 'msck repair', and it no longer threw an error.
- I issued the DROP TABLE command, and it worked.
This issue comes down to the table location, which needs careful handling when creating more than one table in the same bucket. Best practice is to give each table its own subdirectory inside the bucket and to avoid using the bare bucket path as a table location, especially if you have to create multiple tables in the same bucket. Thank you, and feel free to reach out to me for Big Data issues.
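As a rough sketch of that practice, the statements below give each table its own subdirectory under the bucket. The table names (table_one, table_two) and the directory layout are illustrative assumptions, not taken from the original question; the column definitions and table properties are copied from the question's CREATE TABLE statement.

```sql
-- Sketch only: table names and subdirectory layout are hypothetical.
-- Each table points at its own subdirectory, never at the bare bucket path.
CREATE TABLE schema_name.table_one (column1 decimal(10,0), column2 int, column3 date)
PARTITIONED BY (column7 date) STORED AS ORC
LOCATION 'gs://crazybucketstring/schema_name/table_one/'
TBLPROPERTIES ('ORC.COMPRESS'='SNAPPY');

CREATE TABLE schema_name.table_two (column1 decimal(10,0), column2 int, column3 date)
PARTITIONED BY (column7 date) STORED AS ORC
LOCATION 'gs://crazybucketstring/schema_name/table_two/'
TBLPROPERTIES ('ORC.COMPRESS'='SNAPPY');

-- For an existing table, the Location row of this output shows
-- whether it points at a subdirectory or at the bare bucket.
DESCRIBE FORMATTED schema_name.table_one;
```

With this layout, dropping one managed table should only remove that table's subdirectory and leave the other tables in the bucket untouched.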
Source: https://stackoverflow.com/questions/63146214/drop-hive-table-msck-repair-fails-with-table-stored-in-google-cloud-bucket