These two queries seem to return the same results. Is that coincidental or are they really the same?
1.
SELECT t.ItemNumber,
(SELECT TOP 1 ItemDes
Your example #2 had me scratching my head for a while - I thought to myself: "You can't DISTINCT
a single column, what would that mean?" - until I realised what is going on.
When you have
SELECT DISTINCT(t.ItemNumber)
you are not, despite appearances, actually asking for distinct values of t.ItemNumber! Your example #2 actually gets parsed the same as
SELECT DISTINCT
    (t.ItemNumber),
    (SELECT TOP 1 ItemDescription
     FROM Transactions
     WHERE ItemNumber = t.ItemNumber
     ORDER BY DateCreated DESC) AS ItemDescription
FROM Transactions t
with syntactically-correct but superfluous parentheses around t.ItemNumber. It is to the result-set as a whole that DISTINCT applies.
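A quick way to convince yourself is to run the same SELECT with and without the parentheses. This is only a minimal sketch: it uses Python's sqlite3 module as a stand-in for SQL Server (an assumption on my part), the Transactions table is cut down to two columns, and the sample rows are made up. The parenthesised form parses the same way there, so it illustrates the point without being a test of SQL Server itself.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Transactions (ItemNumber INTEGER, ItemDescription TEXT)"
)
# Made-up rows: item 1 has a duplicated row and a second description.
conn.executemany(
    "INSERT INTO Transactions VALUES (?, ?)",
    [(1, "widget"), (1, "widget"), (1, "gadget"), (2, "bolt")],
)

# DISTINCT "applied to" a parenthesised column...
with_parens = conn.execute(
    "SELECT DISTINCT (t.ItemNumber), t.ItemDescription "
    "FROM Transactions t ORDER BY 1, 2"
).fetchall()

# ...and plain DISTINCT over the whole select list.
without_parens = conn.execute(
    "SELECT DISTINCT t.ItemNumber, t.ItemDescription "
    "FROM Transactions t ORDER BY 1, 2"
).fetchall()

print(with_parens)
print(without_parens)
```

Both queries give the same rows, and ItemNumber 1 still appears twice (once per distinct description), which is what you would expect if DISTINCT works on whole rows rather than on the parenthesised column.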
In this case, since your GROUP BY groups by the column that actually varies, you get the same results. I'm actually slightly surprised that SQL Server doesn't (in the GROUP BY example) insist that the subqueried column is mentioned in the GROUP BY list.
Yes, they return the same results.
Normally the GROUP BY clause groups the rows by the column you mention, so an aggregate such as SUM in your select statement is computed once per group. For example, if you have a table like:
O_Id  OrderDate   OrderPrice  Customer
1     2008/11/12  1000        Hansen
2     2008/10/23  1600        Nilsen
3     2008/09/02  700         Hansen
4     2008/09/03  300         Hansen
5     2008/08/30  2000        Jensen
6     2008/10/04  100         Nilsen
If you group by customer and ask for the sum of the order price, you will get:
Customer  SUM(OrderPrice)
Hansen    2000
Nilsen    1700
Jensen    2000
In contrast, DISTINCT just removes duplicate rows from the result. In this case the original table would stay the same, since each row is different from the others.
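If you want to check this on the sample data, here is a small sketch. Again it assumes SQLite (through Python's sqlite3) in place of SQL Server, and the table name Orders is my own choice, since the table above is not named.

```python
import sqlite3

# Sample rows copied from the table above.
rows = [
    (1, "2008/11/12", 1000, "Hansen"),
    (2, "2008/10/23", 1600, "Nilsen"),
    (3, "2008/09/02", 700, "Hansen"),
    (4, "2008/09/03", 300, "Hansen"),
    (5, "2008/08/30", 2000, "Jensen"),
    (6, "2008/10/04", 100, "Nilsen"),
]

# "Orders" is a hypothetical table name; SQLite stands in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Orders "
    "(O_Id INTEGER, OrderDate TEXT, OrderPrice INTEGER, Customer TEXT)"
)
conn.executemany("INSERT INTO Orders VALUES (?, ?, ?, ?)", rows)

# GROUP BY collapses the rows to one per customer and sums within each group.
totals = conn.execute(
    "SELECT Customer, SUM(OrderPrice) FROM Orders "
    "GROUP BY Customer ORDER BY Customer"
).fetchall()

# DISTINCT only drops rows that are identical in every selected column,
# so on this table nothing is removed.
distinct_rows = conn.execute("SELECT DISTINCT * FROM Orders").fetchall()

print(totals)
print(len(distinct_rows))
```

The GROUP BY query comes back with three rows (one per customer), while SELECT DISTINCT * still returns all six, because no two rows match in every column.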