Does the SQL DISTINCT keyword bog down performance?

长发绾君心 2021-01-02 09:28

I have received a SQL query that makes use of the DISTINCT keyword. When I tried running the query, it took at least a minute to join two tables with hundreds of thousands of

4 answers
  •  说谎 (original poster)
    2021-01-02 10:02

    The purpose of DISTINCT is to remove duplicate rows from the result set, where two rows count as duplicates when they match on all the selected columns.

    • If any one of the selected columns is unique after the join, you can drop DISTINCT.
    • If you don't know that, but you do know that the combination of values across the selected columns is unique, you can drop DISTINCT.
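    As a minimal sketch of the first bullet, the snippet below uses Python's built-in sqlite3 module as a stand-in for whatever RDBMS is actually in use. The customers and profiles tables are hypothetical and not taken from the question. Because c.id is a primary key and the join is one-to-one, every result row is already unique, so the query returns the same rows with and without DISTINCT.

```python
import sqlite3

# Hypothetical schema: one profile per customer, so the join is one-to-one.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE profiles (
        customer_id INTEGER PRIMARY KEY REFERENCES customers(id),
        email TEXT NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Ann');
    INSERT INTO profiles VALUES
        (1, 'ann@example.com'), (2, 'bob@example.com'), (3, 'ann2@example.com');
""")

base = """
    SELECT {kw} c.id, c.name
    FROM customers AS c
    JOIN profiles AS p ON p.customer_id = c.id
    ORDER BY c.id
"""

# c.id is unique after the join, so DISTINCT has nothing to remove,
# even though the name 'Ann' appears twice.
plain = conn.execute(base.format(kw="")).fetchall()
deduped = conn.execute(base.format(kw="DISTINCT")).fetchall()

print(plain == deduped)
```

    If the join were one-to-many instead, c.id would no longer be unique in the result and the two queries could differ, which is the case where DISTINCT actually does something.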

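    In fact, with a properly designed database you rarely need DISTINCT, and in the cases where you do, it is usually obvious that you need it. The RDBMS, however, cannot leave uniqueness to chance: unless it can prove the rows are already unique, it must do the extra work of checking, typically by sorting or hashing the rows or by building a temporary indexing structure. That extra pass over the whole result is where the cost comes from.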
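    One way to see that extra work is to compare query plans. The sketch below again uses SQLite through Python's sqlite3 module purely as an illustration; the tables are hypothetical, and other engines report this step differently (for example as a sort or hash aggregate). On SQLite builds I am aware of, the DISTINCT variant gains a plan step mentioning a temporary B-tree for DISTINCT.

```python
import sqlite3

# Hypothetical schema: city is not unique and has no index.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, city TEXT NOT NULL);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER NOT NULL);
""")


def plan_details(sql):
    """Return the human-readable detail text of each EXPLAIN QUERY PLAN row."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The detail text is the last column of every plan row.
    return [row[-1] for row in rows]


query = (
    "SELECT {kw} c.city "
    "FROM customers AS c JOIN orders AS o ON o.customer_id = c.id"
)

plain_plan = plan_details(query.format(kw=""))
distinct_plan = plan_details(query.format(kw="DISTINCT"))

print("without DISTINCT:", plain_plan)
print("with DISTINCT:   ", distinct_plan)
```

    The exact wording of the plan lines varies between SQLite versions, so treat the printed text as a hint about what the engine is doing rather than a stable format.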

    Normally you find DISTINCT all over the place when people are not sure about JOINs and relationships between tables.
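    A typical instance of this pattern, sketched with hypothetical customers and orders tables in SQLite via Python's sqlite3 module: the goal is "customers who have at least one order", the one-to-many join repeats each customer once per order, and DISTINCT is added to hide the repeats. Rewriting the query with EXISTS expresses the intent directly and never produces the duplicates in the first place. Whether it is also faster depends on the engine and the indexes, so that part is an assumption to verify on your own data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id)
    );
    INSERT INTO customers VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO orders VALUES (10, 1), (11, 1), (12, 1), (13, 2);
""")

# The one-to-many join yields one row per order, so Ann appears three times.
fanned_out = conn.execute("""
    SELECT c.id, c.name
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    ORDER BY c.id
""").fetchall()

# DISTINCT hides the fan-out after the duplicates have been produced.
with_distinct = conn.execute("""
    SELECT DISTINCT c.id, c.name
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    ORDER BY c.id
""").fetchall()

# EXISTS asks the actual question and returns each customer at most once.
with_exists = conn.execute("""
    SELECT c.id, c.name
    FROM customers AS c
    WHERE EXISTS (SELECT 1 FROM orders AS o WHERE o.customer_id = c.id)
    ORDER BY c.id
""").fetchall()

print(len(fanned_out), with_distinct == with_exists)
```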

    Also, in classes on pure relational theory, where a result should be a proper set (no repeated elements, that is, no repeated records), it is quite common for people to add DISTINCT to guarantee this property for the sake of theoretical correctness. Sometimes this habit creeps into production systems.
