Function to Calculate Median in SQL Server

孤独总比滥情好 2020-11-22 04:03

According to MSDN, Median is not available as an aggregate function in Transact-SQL. However, I would like to find out whether it is possible to create this functionality (u

30 Answers
  • 2020-11-22 04:16

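    In a UDF, write the following. For each row, the correlated count of values at or below it is compared against half the total row count; for an even number of rows this returns the lower of the two middle values:

    -- [Table] and MedianSortColumn are placeholders for your own table and column
    SELECT TOP 1 T.MedianSortColumn
    FROM [Table] AS T
    WHERE (SELECT COUNT(*) FROM [Table] AS T2
           WHERE T2.MedianSortColumn <= T.MedianSortColumn)
          >= ((SELECT COUNT(*) FROM [Table]) + 1) / 2
    ORDER BY T.MedianSortColumn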
    
  • 2020-11-22 04:16
    --Create Temp Table to Store Results in
    DECLARE @results AS TABLE 
    (
        [Month] datetime not null
     ,[Median] int not null
    );
    
    --This variable will determine the date
    DECLARE @IntDate as int 
    set @IntDate = -13
    
    
    WHILE (@IntDate < 0) 
    BEGIN
    
    --Create Temp Table
    DECLARE @table AS TABLE 
    (
        [Rank] int not null
     ,[Days Open] int not null
    );
    
    --Insert records into Temp Table
    insert into @table 
    
    SELECT 
        rank() OVER (ORDER BY DATEADD(mm, DATEDIFF(mm, 0, DATEADD(ss, SVR.close_date, '1970')), 0), DATEDIFF(day,DATEADD(ss, SVR.open_date, '1970'),DATEADD(ss, SVR.close_date, '1970')),[SVR].[ref_num]) as [Rank]
     ,DATEDIFF(day,DATEADD(ss, SVR.open_date, '1970'),DATEADD(ss, SVR.close_date, '1970')) as [Days Open]
    FROM
     mdbrpt.dbo.View_Request SVR
     LEFT OUTER JOIN dbo.dtv_apps_systems vapp 
     on SVR.category = vapp.persid
     LEFT OUTER JOIN dbo.prob_ctg pctg 
     on SVR.category = pctg.persid
     Left Outer Join [mdbrpt].[dbo].[rootcause] as [Root Cause] 
     on [SVR].[rootcause]=[Root Cause].[id]
     Left Outer Join [mdbrpt].[dbo].[cr_stat] as [Status]
     on [SVR].[status]=[Status].[code]
     LEFT OUTER JOIN [mdbrpt].[dbo].[net_res] as [net] 
     on [net].[id]=SVR.[affected_rc]
    WHERE
     SVR.Type IN ('P') 
     AND
     SVR.close_date IS NOT NULL 
     AND
     [Status].[SYM] = 'Closed'
     AND
     SVR.parent is null
     AND
     [Root Cause].[sym] in ( 'RC - Application','RC - Hardware', 'RC - Operational', 'RC - Unknown')
     AND
     (
      [vapp].[appl_name] in ('3PI','Billing Rpts/Files','Collabrent','Reports','STMS','STMS 2','Telco','Comergent','OOM','C3-BAU','C3-DD','DIRECTV','DIRECTV Sales','DIRECTV Self Care','Dealer Website','EI Servlet','Enterprise Integration','ET','ICAN','ODS','SB-SCM','SeeBeyond','Digital Dashboard','IVR','OMS','Order Services','Retail Services','OSCAR','SAP','CTI','RIO','RIO Call Center','RIO Field Services','FSS-RIO3','TAOS','TCS')
     OR
      pctg.sym in ('Systems.Release Health Dashboard.Problem','DTV QA Test.Enterprise Release.Deferred Defect Log')
     AND  
      [Net].[nr_desc] in ('3PI','Billing Rpts/Files','Collabrent','Reports','STMS','STMS 2','Telco','Comergent','OOM','C3-BAU','C3-DD','DIRECTV','DIRECTV Sales','DIRECTV Self Care','Dealer Website','EI Servlet','Enterprise Integration','ET','ICAN','ODS','SB-SCM','SeeBeyond','Digital Dashboard','IVR','OMS','Order Services','Retail Services','OSCAR','SAP','CTI','RIO','RIO Call Center','RIO Field Services','FSS-RIO3','TAOS','TCS')
     )
     AND
     DATEADD(mm, DATEDIFF(mm, 0, DATEADD(ss, SVR.close_date, '1970')), 0) = DATEADD(mm, DATEDIFF(mm,0,DATEADD(mm,@IntDate,getdate())), 0)
    ORDER BY [Days Open]
    
    
    
    DECLARE @Count AS INT
    SELECT @Count = COUNT(*) FROM @table;
    
    WITH MyResults(RowNo, [Days Open]) AS
    (
        SELECT RowNo, [Days Open] FROM
            (SELECT ROW_NUMBER() OVER (ORDER BY [Days Open]) AS RowNo, [Days Open] FROM @table) AS Foo
    )
    
    
    insert into @results
    SELECT 
     DATEADD(mm, DATEDIFF(mm,0,DATEADD(mm,@IntDate,getdate())), 0) as [Month]
     ,AVG([Days Open])as [Median] FROM MyResults WHERE RowNo = (@Count+1)/2 OR RowNo = ((@Count+1)%2) * ((@Count+2)/2) 
    
    
    set @IntDate = @IntDate+1
    DELETE FROM @table
    END
    
    select *
    from @results
    order by [Month]
    
  • 2020-11-22 04:17

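    The following query returns the median of the values in one column. It cannot be used as, or alongside, an aggregate function, but you can still use it as a sub-query and filter the rows with a WHERE clause in the inner SELECT. TOP 50 PERCENT rounds a fractional row count up, so an odd number of rows yields the exact middle value, while an even number yields the lower of the two middle values.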

    SQL Server 2005+:

    SELECT TOP 1 value
    FROM
    (
        SELECT TOP 50 PERCENT value
        FROM table_name
        ORDER BY value
    ) AS for_median
    ORDER BY value DESC
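
    As a sketch of the sub-query usage described above, the same pattern can be correlated to an outer row to get one value per group. The table variable @Sales, its columns, and the sample rows are made up for illustration, and the multi-row VALUES syntax assumes SQL Server 2008 or later:

    ```sql
    -- Hypothetical sample data: three rows for East, four rows for West
    DECLARE @Sales TABLE (region VARCHAR(10) NOT NULL, amount INT NOT NULL);
    INSERT INTO @Sales (region, amount)
    VALUES ('East', 10), ('East', 20), ('East', 30),
           ('West', 5), ('West', 15), ('West', 25), ('West', 35);

    SELECT r.region,
           (SELECT TOP 1 for_median.amount
            FROM
            (
                -- The WHERE clause in the inner SELECT limits the median to one region
                SELECT TOP 50 PERCENT s.amount
                FROM @Sales AS s
                WHERE s.region = r.region
                ORDER BY s.amount
            ) AS for_median
            ORDER BY for_median.amount DESC) AS median_amount
    FROM (SELECT DISTINCT region FROM @Sales) AS r;
    -- East (odd count)  -> 20
    -- West (even count) -> 15, the lower of the two middle values
    ```

    Because the sub-query runs once per outer row, this is likely to be slow on large tables; it is meant to show the shape of the query rather than a tuned solution.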
    
  • 2020-11-22 04:20

    If you're using SQL Server 2005 or later, this is a nice, simple-ish median calculation for a single column in a table:

    SELECT
    (
     (SELECT MAX(Score) FROM
       (SELECT TOP 50 PERCENT Score FROM Posts ORDER BY Score) AS BottomHalf)
     +
     (SELECT MIN(Score) FROM
       (SELECT TOP 50 PERCENT Score FROM Posts ORDER BY Score DESC) AS TopHalf)
    ) / 2.0 AS Median  -- 2.0 avoids integer division when Score is an integer column
    
  • 2020-11-22 04:20

    For large-scale datasets, you can try this gist:

    https://gist.github.com/chrisknoll/1b38761ce8c5016ec5b2

    It works by aggregating the distinct values found in your set (such as ages or years of birth) and then using SQL window functions to locate any percentile position you specify in the query.
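
    The idea can be sketched roughly as follows. This is not the gist's code, only an illustration of the described approach; the table variable @Person, the column age, and the variable @p are hypothetical, and the running SUM ... OVER (ORDER BY ...) assumes SQL Server 2012 or later:

    ```sql
    -- Hypothetical sample data: seven ages, four distinct values
    DECLARE @Person TABLE (age INT NOT NULL);
    INSERT INTO @Person (age)
    VALUES (30), (30), (41), (41), (41), (52), (67);

    DECLARE @p DECIMAL(5, 4) = 0.5;  -- percentile position to locate (0.5 = median)

    WITH counts AS
    (
        -- Collapse the data to one row per distinct value
        SELECT age, COUNT(*) AS n
        FROM @Person
        GROUP BY age
    ),
    running AS
    (
        -- Running total of rows up to each distinct value, plus the grand total
        SELECT age,
               SUM(n) OVER (ORDER BY age
                            ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS cum_n,
               SUM(n) OVER () AS total_n
        FROM counts
    )
    -- Smallest value whose running count reaches the requested position
    SELECT MIN(age) AS percentile_value
    FROM running
    WHERE cum_n >= @p * total_n;
    -- 41 for the sample rows (the 4th of 7 sorted values)
    ```

    The window functions only see one row per distinct value, which is why this shape suits columns with few distinct values relative to the row count. Note that this sketch picks an existing value and does not interpolate between two middle values.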

  • 2020-11-22 04:21

    Even better:

    DECLARE @Median DECIMAL(12, 2);  -- assumed type; adjust the precision to your data

    SELECT @Median = AVG(1.0 * val)
    FROM
    (
        SELECT o.val, rn = ROW_NUMBER() OVER (ORDER BY o.val), c.c
        FROM dbo.EvenRows AS o
        CROSS JOIN (SELECT c = COUNT(*) FROM dbo.EvenRows) AS c
    ) AS x
    WHERE rn IN ((c + 1)/2, (c + 2)/2);
    

    From the master Himself, Itzik Ben-Gan!
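
    To see why the two row numbers work: with integer division, (c + 1)/2 and (c + 2)/2 are the same row when the count is odd and the two middle rows when it is even, so the average is the median in both cases. A minimal self-contained sketch, using a table variable @EvenRows with made-up values in place of dbo.EvenRows, and an assumed DECIMAL(12, 2) result type:

    ```sql
    -- Hypothetical sample data with an even number of rows
    DECLARE @EvenRows TABLE (val INT NOT NULL);
    INSERT INTO @EvenRows (val) VALUES (1), (2), (3), (4);

    DECLARE @Median DECIMAL(12, 2);

    SELECT @Median = AVG(1.0 * val)   -- 1.0 * val forces a decimal average
    FROM
    (
        SELECT o.val, rn = ROW_NUMBER() OVER (ORDER BY o.val), c.c
        FROM @EvenRows AS o
        CROSS JOIN (SELECT c = COUNT(*) FROM @EvenRows) AS c
    ) AS x
    WHERE rn IN ((c + 1)/2, (c + 2)/2);   -- c = 4 gives rows 2 and 3

    SELECT @Median AS Median;   -- 2.50, the average of 2 and 3
    ```

    Adding a fifth row makes both expressions evaluate to 3, so the query returns the single middle value.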
