Impala GROUP BY partitioned column

Submitted by 拈花ヽ惹草 on 2019-12-12 04:14:16

Question


Theoretical question,

Let's say I have a table with four columns: A, B, C, D. The values of A and D are equal in every row, and the table is partitioned by column A.

Performance-wise, would it make any difference whether I issue this query (calling the table t): SELECT SUM(B) FROM t GROUP BY A; or this one: SELECT SUM(B) FROM t GROUP BY D;

In other words, is there any performance gain from using GROUP BY on the partition column?

Thanks


Answer 1:


Performance gains usually come from using the partition column in a filter (the WHERE clause of your SQL), because that lets the engine skip partitions that cannot match.

Since both of your queries do a full table scan, there should not be much difference between them. You might see a difference if there are a lot of partitions (around 50K, say), which tends to degrade query performance, but that is not usually the case.
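To make the distinction concrete, here is a minimal Python sketch of the idea. It is a toy model, not Impala's actual execution engine: the `partitions` dictionary, the `sum_b_group_by` function, and the sample values are all hypothetical. It only illustrates that grouping by A or by D reads every partition, while a filter on A lets partitions be skipped.

```python
from collections import defaultdict

# Hypothetical partitioned table: partition key (column A) -> rows of (B, D).
# D always equals A, as in the question.
partitions = {
    1: [(10, 1), (20, 1)],
    2: [(5, 2)],
    3: [(7, 3), (8, 3)],
}

def sum_b_group_by(column, where_a=None):
    """Return ({group value: SUM(B)}, number of partitions read)."""
    totals = defaultdict(int)
    partitions_read = 0
    for a, rows in partitions.items():
        # Toy partition pruning: skip partitions that a filter on A rules out.
        if where_a is not None and a != where_a:
            continue
        partitions_read += 1
        for b, d in rows:
            key = a if column == "A" else d
            totals[key] += b
    return dict(totals), partitions_read

by_a, read_a = sum_b_group_by("A")                  # GROUP BY A, no filter
by_d, read_d = sum_b_group_by("D")                  # GROUP BY D, no filter
filtered, read_f = sum_b_group_by("A", where_a=2)   # WHERE A = 2 GROUP BY A

print(by_a == by_d, read_a, read_d, read_f)
```

In this model both unfiltered queries produce the same sums and touch all three partitions, whereas the filtered one touches a single partition. On a real cluster, comparing the EXPLAIN output of the two queries is the way to check what Impala actually does.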



Source: https://stackoverflow.com/questions/40494186/impala-group-by-partitioned-column
