I\'m creating a report comparing total time and volume across units. Here a simplification of the query I\'m using at the moment:
SELECT m.Unit,
COUNT(
NTile is quite inexact. If you run NTile against the sample view below, you will see that it catches some indeterminate number of rows instead of 90% from the center. The suggestion to use TOP 95%, then reverse TOP 90% is almost correct except that 90% x 95% gives you only 85.5% of the original dataset. So you would have to do
select top 94.7368 percent *
from (
select top 95 percent *
from
order by .. ASC
) X
order by .. DESC
First create a view to match your table column names
create view main_table
as
select type unit, number as timeinminutes from master..spt_values
Try this instead
select Unit, COUNT(*), SUM(TimeInMinutes)
FROM
(
select *,
ROW_NUMBER() over (order by TimeInMinutes) rn,
COUNT(*) over () countRows
from main_table
) N -- Numbered
where rn between countRows * 0.05 and countRows * 0.95
group by Unit, N.countRows * 0.05, N.countRows * 0.95
having count(*) > 20
The HAVING clause is applied to the remaining set after removing outliers. For a dataset of 1,1,1,1,1,1,2,5,6,19, the use of ROW_NUMBER allows you to correctly remove just one instance of the 1's.