Wrong result for COUNT(*) in Hive table

野的像风 2021-01-02 08:35

I have created a table in Hive:

CREATE TABLE IF NOT EXISTS daily_firstseen_analysis (
    firstSeen         STRING,
    category          STRING,
    circle           


        
3 Answers
  • 2021-01-02 08:55

    If you have an external table, delete all of its files in HDFS, and then insert into the table again, SELECT COUNT(*) will return an incorrect result.
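
    One way to check whether stale statistics are behind this is sketched below. It rests on an assumption that the question does not confirm: that the wrong count comes from Hive answering COUNT(*) from the statistics stored in the metastore instead of reading the files. If that is the case, turning off hive.compute.query.using.stats for the session should make the query scan the data:

    ```sql
    -- Assumption: the stale count is being served from metastore statistics.
    -- Disable stats-based answers for this session only.
    SET hive.compute.query.using.stats=false;

    -- With the setting off, this query has to scan the table's files.
    SELECT COUNT(*) FROM daily_firstseen_analysis;
    ```

    If the count changes after the SET, the stored statistics were probably out of date.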

  • 2021-01-02 09:01

    Running ANALYZE TABLE ... worked the first time, but it raised an error when I tried again, so I tried this instead:

    hive> REFRESH TABLE daily_firstseen_analysis;
    hive> SELECT COUNT(*) FROM daily_firstseen_analysis;
    

    This is the EXPLAIN output.

  • 2021-01-02 09:10

    I had the same problem, and using ANALYZE fixed it. Running these commands in order should give you the correct count:

    hive> ANALYZE TABLE daily_firstseen_analysis PARTITION(day) COMPUTE STATISTICS;
    hive> SELECT COUNT(*) FROM daily_firstseen_analysis;
    

    That is, you have to run the ANALYZE command before the count. Half of the answer is already in your question.
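
    To see what ANALYZE actually recorded, the stored statistics can be inspected directly. This is only a sketch: the partition value '2021-01-01' is hypothetical and should be replaced with a real value of the day partition column.

    ```sql
    -- Table-level metadata, including any stored statistics.
    DESCRIBE FORMATTED daily_firstseen_analysis;

    -- Partition-level metadata; look for numRows under the partition parameters.
    -- The value '2021-01-01' is a placeholder, not taken from the question.
    DESCRIBE FORMATTED daily_firstseen_analysis PARTITION (day='2021-01-01');
    ```

    If numRows there does not match the data you expect, the statistics are likely stale and ANALYZE needs to be run again.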
