Function to Calculate Median in SQL Server

孤独总比滥情好 2020-11-22 04:03

According to MSDN, Median is not available as an aggregate function in Transact-SQL. However, I would like to find out whether it is possible to create this functionality (u

30 Answers
  • 2020-11-22 04:21

    I came across this page while looking for a set-based solution for the median. After looking at some of the solutions here, I came up with the following. Hope it helps.

    DECLARE @test TABLE(
        i int identity(1,1),
        id int,
        score float
    )
    
    INSERT INTO @test (id,score) VALUES (1,10)
    INSERT INTO @test (id,score) VALUES (1,11)
    INSERT INTO @test (id,score) VALUES (1,15)
    INSERT INTO @test (id,score) VALUES (1,19)
    INSERT INTO @test (id,score) VALUES (1,20)
    
    INSERT INTO @test (id,score) VALUES (2,20)
    INSERT INTO @test (id,score) VALUES (2,21)
    INSERT INTO @test (id,score) VALUES (2,25)
    INSERT INTO @test (id,score) VALUES (2,29)
    INSERT INTO @test (id,score) VALUES (2,30)
    
    INSERT INTO @test (id,score) VALUES (3,20)
    INSERT INTO @test (id,score) VALUES (3,21)
    INSERT INTO @test (id,score) VALUES (3,25)
    INSERT INTO @test (id,score) VALUES (3,29)
    
    DECLARE @counts TABLE(
        id int,
        cnt int
    )
    
    INSERT INTO @counts (
        id,
        cnt
    )
    SELECT
        id,
        COUNT(*)
    FROM
        @test
    GROUP BY
        id
    
    SELECT
        drv.id,
        drv.start,
        AVG(t.score) AS median_score -- averages the one or two middle rows per id
    FROM
        (
            SELECT
                MIN(t.i)-1 AS start,
                t.id
            FROM
                @test t
            GROUP BY
                t.id
        ) drv
        INNER JOIN @test t ON drv.id = t.id
        INNER JOIN @counts c ON t.id = c.id
    WHERE
        t.i = ((c.cnt+1)/2)+drv.start
        OR (
            t.i = (((c.cnt+1)%2) * ((c.cnt+2)/2))+drv.start
            AND ((c.cnt+1)%2) * ((c.cnt+2)/2) <> 0
        )
    GROUP BY
        drv.id,
        drv.start
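
    The query above relies on the identity column i following score order within each id, which holds only because the sample rows were inserted pre-sorted. As a minimal sketch of the same middle-row idea without that assumption, the rows can be numbered per id at query time. It assumes SQL Server 2008 or later (for the multi-row VALUES syntax), and the table variable @scores and its sample data are made up for illustration:

    ```sql
    -- Hypothetical sample data, deliberately inserted out of order.
    DECLARE @scores TABLE (id int, score float);

    INSERT INTO @scores (id, score) VALUES
        (1, 20), (1, 10), (1, 15),           -- odd count for id 1
        (2, 30), (2, 20), (2, 29), (2, 21);  -- even count for id 2

    SELECT
        r.id,
        AVG(r.score) AS median_score
    FROM
        (
            SELECT
                id,
                score,
                ROW_NUMBER() OVER (PARTITION BY id ORDER BY score) AS rn,
                COUNT(*) OVER (PARTITION BY id) AS cnt
            FROM
                @scores
        ) r
    WHERE
        -- Integer division: both expressions give the same row number when
        -- cnt is odd, and the two middle row numbers when cnt is even.
        r.rn IN ((r.cnt + 1) / 2, (r.cnt + 2) / 2)
    GROUP BY
        r.id;
    ```

    For this sample data it returns 15 for id 1 and 25 for id 2.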
    
  • 2020-11-22 04:22

    If you want to use the CREATE AGGREGATE function in SQL Server, this is how to do it. Doing it this way has the benefit of letting you write clean queries. Note that this process could be adapted to calculate a percentile value fairly easily.

    Create a new Visual Studio project and set the target framework to .NET 3.5 (this is for SQL 2008; it may be different in SQL 2012). Then create a class file and put in the following code, or its C# equivalent:

    Imports Microsoft.SqlServer.Server
    Imports System.Collections.Generic
    Imports System.Data.SqlTypes
    Imports System.IO
    Imports System.Linq
    
    <Serializable>
    <SqlUserDefinedAggregate(Format.UserDefined, IsInvariantToNulls:=True, IsInvariantToDuplicates:=False, _
      IsInvariantToOrder:=True, MaxByteSize:=-1, IsNullIfEmpty:=True)>
    Public Class Median
      Implements IBinarySerialize
      Private _items As List(Of Decimal)
    
      Public Sub Init()
        _items = New List(Of Decimal)()
      End Sub
    
      Public Sub Accumulate(value As SqlDecimal)
        If Not value.IsNull Then
          _items.Add(value.Value)
        End If
      End Sub
    
      Public Sub Merge(other As Median)
        If other._items IsNot Nothing Then
          _items.AddRange(other._items)
        End If
      End Sub
    
      Public Function Terminate() As SqlDecimal
        If _items.Count <> 0 Then
          Dim result As Decimal
          _items = _items.OrderBy(Function(i) i).ToList()
          If _items.Count Mod 2 = 0 Then
            ' Even count: average the two middle values (\ is integer division).
            result = (_items((_items.Count \ 2) - 1) + _items(_items.Count \ 2)) / 2D
          Else
            ' Odd count: take the single middle value.
            result = _items((_items.Count - 1) \ 2)
          End If
    
          Return New SqlDecimal(result)
        Else
          Return New SqlDecimal()
        End If
      End Function
    
      Public Sub Read(r As BinaryReader) Implements IBinarySerialize.Read
        'deserialize it from a string
        Dim list = r.ReadString()
        _items = New List(Of Decimal)
    
        For Each value In list.Split(","c)
          Dim number As Decimal
          If Decimal.TryParse(value, number) Then
            _items.Add(number)
          End If
        Next
    
      End Sub
    
      Public Sub Write(w As BinaryWriter) Implements IBinarySerialize.Write
        'serialize the list to a string
        Dim list = ""
    
        For Each item In _items
          If list <> "" Then
            list += ","
          End If      
          list += item.ToString()
        Next
        w.Write(list)
      End Sub
    End Class
    

    Then compile it and copy the DLL and PDB file to your SQL Server machine and run the following command in SQL Server:

    CREATE ASSEMBLY CustomAggregate FROM '{path to your DLL}'
    WITH PERMISSION_SET=SAFE;
    GO
    
    CREATE AGGREGATE Median(@value decimal(9, 3))
    RETURNS decimal(9, 3) 
    EXTERNAL NAME [CustomAggregate].[{namespace of your DLL}.Median];
    GO
    

    You can then write a query to calculate the median like this: SELECT dbo.Median(Field) FROM Table
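
    Once registered, the aggregate should behave like a built-in one, so it can presumably be combined with GROUP BY as well. A hedged usage sketch, assuming the aggregate was created as dbo.Median as shown above; the table dbo.Scores and its columns id and score are hypothetical:

    ```sql
    -- dbo.Scores, id and score are illustrative names, not part of the answer.
    SELECT
        id,
        dbo.Median(CAST(score AS decimal(9, 3))) AS median_score
    FROM
        dbo.Scores
    GROUP BY
        id;
    ```

    The cast matches the decimal(9, 3) parameter declared in CREATE AGGREGATE; values that do not fit that precision would need a wider declaration.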

  • 2020-11-22 04:22

    This works with SQL 2000:

    DECLARE @testTable TABLE 
    ( 
        VALUE   INT
    )
    --INSERT INTO @testTable -- Even Test
    --SELECT 3 UNION ALL
    --SELECT 5 UNION ALL
    --SELECT 7 UNION ALL
    --SELECT 12 UNION ALL
    --SELECT 13 UNION ALL
    --SELECT 14 UNION ALL
    --SELECT 21 UNION ALL
    --SELECT 23 UNION ALL
    --SELECT 23 UNION ALL
    --SELECT 23 UNION ALL
    --SELECT 23 UNION ALL
    --SELECT 29 UNION ALL
    --SELECT 40 UNION ALL
    --SELECT 56
    
    --
    --INSERT INTO @testTable -- Odd Test
    --SELECT 3 UNION ALL
    --SELECT 5 UNION ALL
    --SELECT 7 UNION ALL
    --SELECT 12 UNION ALL
    --SELECT 13 UNION ALL
    --SELECT 14 UNION ALL
    --SELECT 21 UNION ALL
    --SELECT 23 UNION ALL
    --SELECT 23 UNION ALL
    --SELECT 23 UNION ALL
    --SELECT 23 UNION ALL
    --SELECT 29 UNION ALL
    --SELECT 39 UNION ALL
    --SELECT 40 UNION ALL
    --SELECT 56
    
    
    DECLARE @RowAsc TABLE
    (
        ID      INT IDENTITY,
        Amount  INT
    )
    
    INSERT INTO @RowAsc
    SELECT  VALUE 
    FROM    @testTable 
    ORDER BY VALUE ASC
    
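    SELECT  AVG(CAST(ra.Amount AS FLOAT)) AS Median -- cast avoids integer truncation of the average
    FROM    @RowAsc ra
    WHERE   ra.ID -
    (
        SELECT  MAX(ID) / 2.0
        FROM    @RowAsc
    ) BETWEEN 0 AND 1 -- keeps the middle row (odd count) or the two middle rows (even count)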
    
  • 2020-11-22 04:22

    The following solution works under these assumptions:

    • No duplicate values
    • No NULLs

    Code:

    IF OBJECT_ID('dbo.R', 'U') IS NOT NULL
      DROP TABLE dbo.R
    
    CREATE TABLE R (
        A FLOAT NOT NULL);
    
    INSERT INTO R VALUES (1);
    INSERT INTO R VALUES (2);
    INSERT INTO R VALUES (3);
    INSERT INTO R VALUES (4);
    INSERT INTO R VALUES (5);
    INSERT INTO R VALUES (6);
    
    -- Returns Median(R)
    select SUM(A) / CAST(COUNT(A) AS FLOAT)
    from R R1 
    where ((select count(A) from R R2 where R1.A > R2.A) = 
          (select count(A) from R R2 where R1.A < R2.A)) OR
          ((select count(A) from R R2 where R1.A > R2.A) + 1 = 
          (select count(A) from R R2 where R1.A < R2.A)) OR
          ((select count(A) from R R2 where R1.A > R2.A) = 
          (select count(A) from R R2 where R1.A < R2.A) + 1) ; 
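
    To see why these rows are picked, and why the assumptions matter, it can help to list how many values lie below and above each row. A small diagnostic sketch against the table R created above (the column aliases below_count and above_count are illustrative):

    ```sql
    -- For each value in R, count the values strictly below and strictly above it.
    SELECT
        R1.A,
        (SELECT COUNT(A) FROM R R2 WHERE R2.A < R1.A) AS below_count,
        (SELECT COUNT(A) FROM R R2 WHERE R2.A > R1.A) AS above_count
    FROM R R1
    ORDER BY R1.A;
    ```

    For the six sample rows, A = 3 has 2 values below and 3 above, and A = 4 has 3 below and 2 above, so both satisfy the off-by-one conditions and the median is (3 + 4) / 2 = 3.5. With duplicates the counts can jump past the balance point: for the values 1, 1, 1, 2, 3 no row satisfies any of the three conditions, so the median query does not return the median.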
    
  • 2020-11-22 04:24

    See other solutions for median calculation in SQL here: "Simple way to calculate median with MySQL" (the solutions are mostly vendor-independent).

  • 2020-11-22 04:25

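    For a continuous variable/measure 'col1' from 'table1':

    -- Number the rows in sorted order and keep the middle one (odd count)
    -- or the middle two (even count), then average them.
    select avg(1.0 * col1) as median
    from
        (select col1,
        ROW_NUMBER() OVER(ORDER BY col1 ASC) AS Rowa,
        COUNT(*) OVER() AS Cnt
        from table1
        where col1 is not null) tmp
    where tmp.Rowa in ((tmp.Cnt + 1) / 2, (tmp.Cnt + 2) / 2)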
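
    On SQL Server 2012 and later, the built-in analytic function PERCENTILE_CONT may be a simpler option, since it interpolates between the two middle values when the row count is even. A minimal sketch, using a made-up table variable @table1 in place of table1:

    ```sql
    -- Hypothetical sample data standing in for table1.
    DECLARE @table1 TABLE (col1 float);
    INSERT INTO @table1 (col1) VALUES (1), (2), (3), (4);

    -- PERCENTILE_CONT is evaluated as a window function, so it yields the
    -- same value on every row; DISTINCT collapses the result to one row.
    SELECT DISTINCT
        PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY col1) OVER () AS median
    FROM @table1;
    ```

    For the four sample values this returns 2.5.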
    