Question
A theoretical question:
Let's say I have a table with four columns: A, B, C, D. The values of A and D are equal, and the table is partitioned by column A.
Performance-wise, would it make any difference whether I issue the query SELECT SUM(B) GROUP BY A; or the query SELECT SUM(B) GROUP BY D;
In other words, is there any performance gain from using GROUP BY on the partitioned column?
Thanks
Answer 1:
Usually there are performance gains when you use the partition columns in a filter (the WHERE clause of your SQL), because the engine can then skip the partitions that do not match.
Since both of your queries perform a full table scan, there should not be much difference between them. You might see a difference if there are a lot of partitions (around 50K, say), which tends to degrade query performance, but that is not usually the case.
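
One way to check this on your own data is to compare the query plans. The following is only a sketch: the table name t and the filter value 1 are assumptions for illustration (the question does not name the table or give the type of A), and the exact plan output depends on your Impala version.

```sql
-- Hypothetical table t with columns A, B, C, D, partitioned by A.
-- (The table name and the literal 1 below are assumed, not from the question.)

-- Both aggregations have to read every partition, so the scan cost
-- should be about the same whether you group by A or by D.
EXPLAIN SELECT A, SUM(B) FROM t GROUP BY A;
EXPLAIN SELECT D, SUM(B) FROM t GROUP BY D;

-- A filter on the partition column lets the planner prune partitions,
-- so the scan node in this plan should report fewer partitions read.
EXPLAIN SELECT A, SUM(B) FROM t WHERE A = 1 GROUP BY A;
```

If the first two plans show the same scan over all partitions and only the third shows a reduced partition count, that matches the reasoning above: the gain comes from filtering on the partition column, not from grouping on it.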
Source: https://stackoverflow.com/questions/40494186/impala-group-by-partitioned-column