Can I optimize a SELECT DISTINCT x FROM hugeTable query by creating an index on column x?

闹比i 2021-01-01 11:51

I have a huge table in which some column x has far fewer distinct values than there are rows (fewer by orders of magnitude).

I need to do a query like SELECT D

8 answers
  • 2021-01-01 12:22

    If you know the values in advance and there is an index on column x (or if each value is likely to appear quickly on a seq scan of the whole table), it is much faster to query each one individually:

    -- (1), (2), (3) are placeholders: list the candidate values known in advance
    select vals.x
    from (values (1), (2), (3)) as vals (x)
    where exists (select 1 from bigtable where bigtable.x = vals.x);
    

    With exists(), the engine does one index lookup per candidate value, and each lookup can stop at the first matching row.

    The way you've written it (which is correct if the values are not known in advance), the query engine will need to read the whole table and hash aggregate the mess to extract the values. (Which makes the index useless.)
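
The exists() approach from the answer above can be tried end to end with a small script. This is only a sketch under stated assumptions: it uses SQLite through Python's sqlite3 module as a stand-in for whatever engine the question targets, the table contents, the index name bigtable_x_idx, and the candidate list are all made up for illustration, and the value list is written as a CTE, a form SQLite accepts. It shows the behaviour of the query, not its performance on a real server.

```python
import sqlite3

# Hypothetical demo data: a "big" table with very few distinct values in x.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bigtable (id INTEGER PRIMARY KEY, x INTEGER)")
conn.executemany(
    "INSERT INTO bigtable (x) VALUES (?)",
    [(i % 3,) for i in range(10_000)],  # x only ever takes the values 0, 1, 2
)
conn.execute("CREATE INDEX bigtable_x_idx ON bigtable (x)")  # assumed index name

# Candidate values known in advance; 5 is deliberately absent from the table.
candidates = [0, 1, 2, 5]
placeholders = ", ".join("(?)" for _ in candidates)

# One EXISTS probe per candidate value, each of which can use the index on x.
query = f"""
    WITH vals(x) AS (VALUES {placeholders})
    SELECT vals.x
    FROM vals
    WHERE EXISTS (SELECT 1 FROM bigtable WHERE bigtable.x = vals.x)
    ORDER BY vals.x
"""
present = [row[0] for row in conn.execute(query, candidates)]

# Baseline for comparison: the plain DISTINCT query from the question.
distinct = [
    row[0] for row in conn.execute("SELECT DISTINCT x FROM bigtable ORDER BY x")
]

print(present)
print(distinct)
```

Both lists contain the same values, since every value that occurs in the table is among the candidates. A candidate that never occurs (5 here) is simply filtered out by the EXISTS check, so an over-generous candidate list is harmless.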

  • 2021-01-01 12:24

    Possibly. Though it is not guaranteed - it entirely depends on the query.

    I suggest reading this article by Gail Shaw (part 1 and part 2).
