Are these two queries the same - GROUP BY vs. DISTINCT?

These two queries seem to return the same results. Is that coincidental or are they really the same?

SELECT t.ItemNumber,
  (SELECT TOP 1 ItemDes


                      
8 answers

猫巷女王i (2021-01-05 08:19)

If you're running at least SQL Server 2005 and can use a CTE, this is a little cleaner IMHO.

EDIT: As pointed out in Martin's answer, this also performs much better.

-- Latest DateCreated for each ItemNumber
;WITH cteMaxDate AS (
    SELECT t.ItemNumber, MAX(t.DateCreated) AS MaxDate
        FROM Transactions t
        GROUP BY t.ItemNumber
)
-- Join back to pick up the description stored on that latest row
SELECT t.ItemNumber, t.ItemDescription
    FROM cteMaxDate md
        INNER JOIN Transactions t
            ON md.ItemNumber = t.ItemNumber
                AND md.MaxDate = t.DateCreated

                                                                        
                                                        
            
            
              
                
暖寄归人 (2021-01-05 08:21)

Since you're not using any aggregate functions, SQL Server should be smart enough to treat the GROUP BY as a DISTINCT.

You may also be interested in the following Stack Overflow post for further reading on this topic:

Is there any difference between Group By and Distinct?
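
A minimal sketch of that equivalence, assuming the Transactions table and column names used elsewhere in this thread. With no aggregate in the select list, grouping by every selected column should return the same rows as DISTINCT over those columns, though the execution plans may still differ:

```sql
-- One row per distinct (ItemNumber, ItemDescription) pair.
SELECT DISTINCT ItemNumber, ItemDescription
FROM Transactions;

-- Same row set: every selected column is also a grouping column.
SELECT ItemNumber, ItemDescription
FROM Transactions
GROUP BY ItemNumber, ItemDescription;
```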

                                                                        
                                                        
            
            
              
                
北海茫月 (2021-01-05 08:37)

Same results, but in my quick test the second one seems to have a more expensive sort step to apply the DISTINCT.

Both were beaten out of sight by ROW_NUMBER though...

;WITH T AS
(
    SELECT ItemNumber,
           ItemDescription,
           -- RN = 1 marks the most recent row for each ItemNumber
           ROW_NUMBER() OVER (PARTITION BY ItemNumber ORDER BY DateCreated DESC) AS RN
    FROM Transactions
)
SELECT * FROM T
WHERE RN = 1


Edit: ...which in turn was thumped by Joe's solution on my test setup.

Test Setup

CREATE TABLE Transactions
(
    ItemNumber INT NOT NULL,
    ItemDescription VARCHAR(50) NOT NULL,
    DateCreated DATETIME NOT NULL
)

-- One row per spt_values entry: a GUID as the description and a
-- random date between 0 and 9,999 days after today
INSERT INTO Transactions (ItemNumber, ItemDescription, DateCreated)
SELECT
    number, NEWID(), DATEADD(day, CAST(RAND(CAST(NEWID() AS VARBINARY)) * 10000
      AS INT), GETDATE())
FROM master.dbo.spt_values

ALTER TABLE dbo.Transactions ADD CONSTRAINT
    PK_Transactions PRIMARY KEY CLUSTERED
    (ItemNumber, DateCreated)
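
One way to reproduce a comparison like this, sketched on the assumption that the test table above has been created and loaded, is to switch on SQL Server's I/O and timing statistics and run each candidate in turn. The two candidates below are only the ROW_NUMBER and MAX-plus-join variants from this thread. The numbers you see will depend on your data and indexes.

```sql
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Candidate 1: ROW_NUMBER, keeping the newest row per ItemNumber
WITH T AS
(
    SELECT ItemNumber,
           ItemDescription,
           ROW_NUMBER() OVER (PARTITION BY ItemNumber ORDER BY DateCreated DESC) AS RN
    FROM Transactions
)
SELECT ItemNumber, ItemDescription
FROM T
WHERE RN = 1;

-- Candidate 2: MAX per ItemNumber, joined back to the table
WITH cteMaxDate AS
(
    SELECT ItemNumber, MAX(DateCreated) AS MaxDate
    FROM Transactions
    GROUP BY ItemNumber
)
SELECT t.ItemNumber, t.ItemDescription
FROM cteMaxDate md
    INNER JOIN Transactions t
        ON md.ItemNumber = t.ItemNumber
       AND md.MaxDate = t.DateCreated;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

The logical reads and CPU/elapsed times appear in the Messages output. Comparing the actual execution plans side by side is a useful complement.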

                                                                        
                                                        
            
            
              
                
闹比i (2021-01-05 08:38)

GROUP BY is needed to return correct results when a SQL query uses aggregate functions. As you are not using an aggregate function, there is no need for the GROUP BY, and thus the queries are the same.
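
For contrast, here is a small sketch of the case where GROUP BY is doing real work, assuming the Transactions table from this thread (the column aliases are made up for illustration):

```sql
-- One row per ItemNumber, with aggregates computed over each group.
SELECT ItemNumber,
       COUNT(*)         AS TransactionCount,
       MAX(DateCreated) AS LatestDate
FROM Transactions
GROUP BY ItemNumber;
```

DISTINCT alone cannot express this, since it only removes duplicate rows and computes nothing per group.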
                                                                        
                                                        
            
            
              
                
独厮守ぢ (2021-01-05 08:39)

Based on the data and the simple queries, both will return the same results. However, the fundamental operations are very different.

DISTINCT, as AakashM beat me to pointing out, is applied to all column values, including those from subselects and computed columns. All DISTINCT does is remove duplicate rows from the result, comparing every column in the select list. This is why it's generally considered a hack: people use it to get rid of duplicates without understanding why the query returns them in the first place (typically because they should be using IN or EXISTS rather than a join). PostgreSQL is the only database I know of with a DISTINCT ON clause, which does work as the OP probably intended.
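
A small sketch of the "all columns" behaviour, using made-up inline rows rather than the real table (the VALUES row constructor needs SQL Server 2008 or later):

```sql
-- Hypothetical sample: item 1 appears with two different descriptions,
-- item 2 appears twice with the same description.
SELECT DISTINCT v.ItemNumber, v.ItemDescription
FROM (VALUES (1, 'Widget'),
             (1, 'Widget v2'),
             (2, 'Gadget'),
             (2, 'Gadget')) AS v (ItemNumber, ItemDescription);
-- Three rows come back: item 1 is listed twice because its descriptions
-- differ; only the exact duplicate for item 2 is collapsed.
```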

A GROUP BY clause is different: its primary use is grouping rows so that aggregate functions are computed correctly. To serve that function, the result contains one row per unique combination of the columns listed in the GROUP BY clause. This query would never need DISTINCT, because the values of interest are already unique.
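
As an illustration only, with made-up inline rows (again assuming SQL Server 2008 or later for the VALUES constructor), grouping on ItemNumber alone gives exactly one row per item, however many descriptions exist:

```sql
-- Hypothetical sample rows; the aliases are illustrative.
SELECT v.ItemNumber,
       COUNT(*)               AS RowsInGroup,
       MAX(v.ItemDescription) AS HighestDescription
FROM (VALUES (1, 'Widget'),
             (1, 'Widget v2'),
             (2, 'Gadget'),
             (2, 'Gadget')) AS v (ItemNumber, ItemDescription)
GROUP BY v.ItemNumber;
-- Two rows come back, one per ItemNumber, each with RowsInGroup = 2.
```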

Conclusion

This is a poor example, because it portrays DISTINCT and GROUP BY as equals when they are not.
                                                                        
                                                        
            
            
              
                
無奈伤痛 (2021-01-05 08:39)

Yes, they will return the same results. 
                                                                        
                                                        
            
            
              
                