How do I exclude outliers from an aggregate query?


情书的邮戳 2021-02-10 09:30

I'm creating a report comparing total time and volume across units. Here is a simplification of the query I'm using at the moment:

SELECT  m.Unit,
        COUNT(


      
      
        
4 answers

谎友^ (original poster) 2021-02-10 09:47
              

            
            
                        
NTILE is quite inexact. If you run NTILE against the sample view below, you will see that it catches an indeterminate number of rows rather than the central 90%. The suggestion to use TOP 95 PERCENT and then reverse it with TOP 90 PERCENT is almost correct, except that 90% x 95% leaves only 85.5% of the original dataset. To keep 90% overall, the outer query has to take 90 / 95, or about 94.7368 percent:

select top 94.7368 percent *
from (
select top 95 percent *
    from 
    order by .. ASC
) X
order by .. DESC
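
The 94.7368 figure in the nested TOP query above can be checked with a few lines of arithmetic. This is only a sketch of the calculation in Python; the variable names are illustrative and do not come from the original query.

```python
from fractions import Fraction

# Fraction of rows the inner query keeps (TOP 95 PERCENT).
inner_keep = Fraction(95, 100)
# Fraction of the ORIGINAL rows we want to end up with.
target_keep = Fraction(90, 100)

# Naive approach: TOP 90 PERCENT of the already-trimmed 95%.
naive = inner_keep * Fraction(90, 100)

# Outer percentage needed so that the product equals the target.
required_outer = target_keep / inner_keep

print(float(naive))                            # 0.855
print(round(float(required_outer) * 100, 4))   # 94.7368
```

The outer TOP percentage is applied to the rows that survive the inner query, which is why it has to be larger than 90 to land on 90% of the original set.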


First, create a view whose column names match your table:

create view main_table
as
select type unit, number as timeinminutes from master..spt_values


Then try this instead:

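-- Number every row by TimeInMinutes and attach the total row count,
-- then keep only the rows whose position falls in the middle 90%.
select Unit, COUNT(*), SUM(TimeInMinutes)
from
(
    select *,
        ROW_NUMBER() over (order by TimeInMinutes) as rn,
        COUNT(*) over () as countRows
    from main_table
) N -- Numbered
where rn between countRows * 0.05 and countRows * 0.95
-- countRows is the same on every row, so grouping by Unit alone is enough
group by Unit
having COUNT(*) > 20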


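The HAVING clause is applied to the set that remains after the outliers are removed.
Because ROW_NUMBER gives tied values distinct positions, a cutoff can remove a single instance of a repeated value instead of every row that shares it. For a dataset of 1,1,1,1,1,1,2,5,6,19, that means a lower cutoff can drop just one of the 1's rather than all six. Note that with exactly these ten rows and the 5%/95% bounds, the limits are 0.5 and 9.5, so the query keeps positions 1 through 9 and drops only the 19.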
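
To see what the row-position filter does with ties, here is a minimal Python sketch that imitates the ROW_NUMBER and BETWEEN logic outside the database. The function name `trim_by_row_number` and its integer-percent parameters are illustrative assumptions, not part of the SQL above.

```python
def trim_by_row_number(values, lower_pct=5, upper_pct=95):
    """Keep values whose 1-based position in sorted order satisfies
    count * lower_pct/100 <= position <= count * upper_pct/100,
    mirroring `rn BETWEEN countRows * 0.05 AND countRows * 0.95`."""
    ordered = sorted(values)
    count = len(ordered)
    # Compare in integer hundredths to avoid floating-point edge cases.
    return [
        v
        for rn, v in enumerate(ordered, start=1)
        if count * lower_pct <= rn * 100 <= count * upper_pct
    ]


data = [1, 1, 1, 1, 1, 1, 2, 5, 6, 19]

# Bounds 0.5 and 9.5: positions 1..9 are kept, only the 19 is dropped.
print(trim_by_row_number(data))          # [1, 1, 1, 1, 1, 1, 2, 5, 6]

# Bounds 2 and 8: exactly one of the 1's is dropped, plus the 6 and 19.
print(trim_by_row_number(data, 20, 80))  # [1, 1, 1, 1, 1, 2, 5]
```

The second call uses wider cutoffs purely to make the tie handling visible: only the first position is trimmed from the low end, even though six rows share the value 1.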
    
             
                                                        
            
            
              
                