Are these two queries the same - GROUP BY vs. DISTINCT?

無奈伤痛 2021-01-05 07:38

These two queries seem to return the same results. Is that coincidental or are they really the same?

1.

SELECT t.ItemNumber,
  (SELECT TOP 1 ItemDes         


        
8 Answers
  • 2021-01-05 08:40

    Your example #2 had me scratching my head for a while - I thought to myself: "You can't DISTINCT a single column, what would that mean?" - until I realised what is going on.

    When you have

    SELECT DISTINCT(t.ItemNumber)
    

    you are not, despite appearances, actually asking for distinct values of t.ItemNumber! Your example #2 actually gets parsed the same as

    SELECT DISTINCT
      (t.ItemNumber)
      ,
      (SELECT TOP 1 ItemDescription
       FROM Transactions
       WHERE ItemNumber = t.ItemNumber
       ORDER BY DateCreated DESC) AS ItemDescription
    FROM Transactions t
    

    with syntactically-correct but superfluous parentheses around t.ItemNumber. It is to the result-set as a whole that DISTINCT applies.
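
    A quick way to convince yourself is a minimal sketch like the one below. The inline rows and the descriptions 'Widget' and 'Gadget' are made up for illustration; they are not from your Transactions table.

    ```sql
    -- Hypothetical sample rows: one ItemNumber with two different descriptions.
    -- The parentheses around ItemNumber are just an expression grouping.
    SELECT DISTINCT (ItemNumber), ItemDescription
    FROM (VALUES (1, 'Widget'),
                 (1, 'Widget'),
                 (1, 'Gadget')) AS t (ItemNumber, ItemDescription);
    ```

    If DISTINCT applied to ItemNumber alone, this would return a single row. It returns two rows, (1, 'Widget') and (1, 'Gadget'), because only the exact duplicate row is removed.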

    In this case, since your GROUP BY groups by the column that actually varies, you get the same results. I'm actually slightly surprised that SQL Server doesn't (in the GROUP BY example) insist that the subqueried column is mentioned in the GROUP BY list.

  • 2021-01-05 08:42

    Yes, they return the same results.

    Normally the GROUP BY clause groups the rows by the specified column, so that an aggregate such as SUM in your SELECT statement is computed once per group. Thus, if you have a table like:

    O_Id        OrderDate   OrderPrice      Customer
    1           2008/11/12  1000            Hansen
    2           2008/10/23  1600            Nilsen
    3           2008/09/02  700             Hansen
    4           2008/09/03  300             Hansen
    5           2008/08/30  2000            Jensen
    6           2008/10/04  100             Nilsen
    

    If you group by customer and ask for the sum of the order price, you will get

    Customer    SUM(OrderPrice)
    Hansen      2000
    Nilsen      1700
    Jensen      2000
    

    In contrast, DISTINCT just removes duplicate rows from the result. Applied to every column of the table above, it would leave the table unchanged, since each row already differs from the others.
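
    The difference can be sketched in T-SQL roughly as follows. The table variable @Orders, its column types, and the alias TotalPrice are assumptions made for this example; only the six rows come from the table above.

    ```sql
    -- Hypothetical table variable holding the sample rows from above.
    DECLARE @Orders TABLE (
        O_Id       int,
        OrderDate  date,
        OrderPrice int,
        Customer   varchar(20)
    );

    INSERT INTO @Orders (O_Id, OrderDate, OrderPrice, Customer) VALUES
        (1, '20081112', 1000, 'Hansen'),
        (2, '20081023', 1600, 'Nilsen'),
        (3, '20080902',  700, 'Hansen'),
        (4, '20080903',  300, 'Hansen'),
        (5, '20080830', 2000, 'Jensen'),
        (6, '20081004',  100, 'Nilsen');

    -- GROUP BY: one row per customer, with the prices summed per group.
    SELECT Customer, SUM(OrderPrice) AS TotalPrice
    FROM @Orders
    GROUP BY Customer;

    -- DISTINCT over all columns: no two rows are identical, so all six remain.
    SELECT DISTINCT O_Id, OrderDate, OrderPrice, Customer
    FROM @Orders;

    -- DISTINCT over Customer only: three rows, one per customer, no aggregate.
    SELECT DISTINCT Customer
    FROM @Orders;
    ```

    The last query yields the same three customers that GROUP BY Customer would without an aggregate, which is why the two forms can look interchangeable when nothing is being summed.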
