Why is a UDF so much slower than a subquery?

前端未结

关注

 4  1425

I have a case where I need to translate (lookup) several values from the same table. The first way I wrote it, was using subqueries:

SELECT
    (SELECT id FRO


                      
              相关标签:


      
      
        
          4条回答        

        
                         				            
            
           
            
                              
                
              
              
                
                  旧时难觅i        
                
              
                            
                2020-11-28 12:37
              
            
            
                                                                       
To get the same result (NULL if user is deleted or not active).

 select 
    u1.id as creator,
    u2.id as updater,
    u3.id as owner,
    [a.name]
 FROM asset a
        LEFT JOIN user u1 ON (u1.user_pk = a.created_by AND u1.active=1) 
        LEFT JOIN user u2 ON (u2.user_pk = a.created_by AND u2.active=1) 
        LEFT JOIN user u3 ON (u3.user_pk = a.created_by AND u3.active=1) 

                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  臣服心动        
                
              
                            
                2020-11-28 12:52
              
            
            
                                                                       
Am I missing something? Why can't this work? You are only selecting the id which you already have in the table:

select created_by as creator, updated_by as updater, 
owned_by as owner, [name]
from asset


By the way, in designing you really should avoid keywords, like name, as field names.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  别跟我提以往        
                
              
                            
                2020-11-28 12:55
              
            
            
                                                                       
As other posters have suggested, using joins will definitely give you the best overall performance.

However, since you've stated that that you don't want the headache of maintaining 50-ish similar joins or subqueries, try using an inline table-valued function as follows:

CREATE FUNCTION dbo.get_user_inline (@user_pk INT)
RETURNS TABLE AS
RETURN
(
    SELECT TOP 1 id
    FROM ice.dbo.[user]
    WHERE user_pk = @user_pk
        -- AND active = 1
)


Your original query would then become something like:

SELECT
    (SELECT TOP 1 id FROM dbo.get_user_inline(created_by)) AS creator,
    (SELECT TOP 1 id FROM dbo.get_user_inline(updated_by)) AS updater,
    (SELECT TOP 1 id FROM dbo.get_user_inline(owned_by)) AS owner,
    [name]
FROM asset


An inline table-valued function should have better performance than either a scalar function or a multistatement table-valued function.

The performance should be roughly equivalent to your original query, but any future changes can be made in the UDF, making it much more maintainable.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
            
           
            
                              
                
              
              
                
                  离开以前        
                
              
                            
                2020-11-28 12:58
              
            
            
                                                                       
The UDF is a black box to the query optimiser so it's executed for every row.
You are doing a row-by-row cursor. For each row in an asset, look up an id three times in another table. This happens when you use scalar or multi-statement UDFs (In-line UDFs are simply macros that expand into the outer query)

One of many articles on the problem is "Scalar functions, inlining, and performance: An entertaining title for a boring post".

The sub-queries can be optimised to correlate and avoid the row-by-row operations.

What you really want is this:

SELECT
   uc.id AS creator,
   uu.id AS updater,
   uo.id AS owner,
   a.[name]
FROM
    asset a
    JOIN
    user uc ON uc.user_pk = a.created_by
    JOIN
    user uu ON uu.user_pk = a.updated_by
    JOIN
    user uo ON uo.user_pk = a.owned_by


Update Feb 2019

SQL Server 2019 starts to fix this problem.
                                                                        
                                                        
            
            
              
                
                0
              
                 
                
               讨论(0)
              
              
                                                   
              
                                                            
            
                      
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
          	          
                             
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复