I\'m creating a report comparing total time and volume across units. Here a simplification of the query I\'m using at the moment:
SELECT m.Unit,
COUNT(
I think the most robust way is to sort the list into order and then exclude the top and bottom extremes. For a hundred values, you would sort ascending and take the first 95 PERCENT, then sort descending and take the first 90 PERCENT.
NTile is quite inexact. If you run NTile against the sample view below, you will see that it catches some indeterminate number of rows instead of 90% from the center. The suggestion to use TOP 95%, then reverse TOP 90% is almost correct except that 90% x 95% gives you only 85.5% of the original dataset. So you would have to do
select top 94.7368 percent *
from (
select top 95 percent *
from
order by .. ASC
) X
order by .. DESC
First create a view to match your table column names
create view main_table
as
select type unit, number as timeinminutes from master..spt_values
Try this instead
select Unit, COUNT(*), SUM(TimeInMinutes)
FROM
(
select *,
ROW_NUMBER() over (order by TimeInMinutes) rn,
COUNT(*) over () countRows
from main_table
) N -- Numbered
where rn between countRows * 0.05 and countRows * 0.95
group by Unit, N.countRows * 0.05, N.countRows * 0.95
having count(*) > 20
The HAVING clause is applied to the remaining set after removing outliers. For a dataset of 1,1,1,1,1,1,2,5,6,19, the use of ROW_NUMBER allows you to correctly remove just one instance of the 1's.
You can exclude the top and bottom x percentiles with NTILE
SELECT m.Unit,
COUNT(*) AS Count,
SUM(m.TimeInMinutes) AS TotalTime
FROM
(SELECT
m.Unit,
NTILE(20) OVER (ORDER BY m.TimeInMinutes) AS Buckets
FROM
main_table m
WHERE
m.unit <> '' AND m.TimeInMinutes > 0
) m
WHERE
Buckets BETWEEN 2 AND 19
GROUP BY m.Unit
HAVING COUNT(*) > 15
Edit: this article has several techniques too
One way would be to exclude the outliers with a not in
clause:
where m.ID not in
(
select top 5 percent ID
from main_table
order by
TimeInMinutes desc
)
And another not in
clause for the bottom five percent.