Create a user defined function that works with GROUP BY in mysql

[亡魂溺海] 提交于 2020-06-23 14:36:10

问题


I'm trying to create an aggregated function MEDIAN() in MySQL like MIN(), MAX(), AVG() which takes the input the column name or string that has concatenated values of the desired column.

I'm having trouble understanding the limitations of MySQL custom functions & would be really helpful if some can help me find out how this is done.

Example:

MySQL table has 2 columns (ID, num)

+----+-----+
| id | num |
+----+-----+
|  1 |   5 |
|  1 |   6 |
|  1 |   7 |
|  2 |   1 |
|  2 |   3 |
|  2 |   5 |
+----+-----+

SELECT id, MEDIAN(num) as median
FROM table
GROUP BY id;

OR

SELECT id, MEDIAN(GROUP_CONCAT(num SEPARATOR ',') as median
FROM table
GROUP BY id;

Expected Output is

+----+--------+
| id | median |
+----+--------+
|  1 |      6 |
|  2 |      3 |
+----+--------+

回答1:


User defined aggregate stored functions were added in MariaDB-10.3.3

MySQL can do aggregate functions however not in SQL. They need a UDF (shared library implemenation)




回答2:


It is possible to do without a UDF, and I know of two ways to do it. The first uses two selects and a join, the first select to get the values and rankings, and the second select to get the counts, then joins them. The second uses json functions to get everything in one select. They are both a little lengthy, but they work and are reasonably fast.

SOLUTION #1 (two selects and a join, one to get counts, one to get rankings)

SELECT  x.group_field, 
        avg(
            if( 
                x.rank - y.vol/2 BETWEEN 0 AND 1, 
                value_field, 
                null
            )
        ) as median
FROM (
    SELECT  group_field, value_field, 
            @r:= IF(@current=group_field, @r+1, 1) as rank, 
            @current:=group_field
    FROM (
        SELECT group_field, value_field
        FROM table_name
        ORDER BY group_field, value_field
    ) z, (SELECT @r:=0, @current:='') v
) x, (
    SELECT group_field, count(*) as vol 
    FROM table_name
    GROUP BY group_field
) y WHERE x.group_field = y.group_field
GROUP BY x.group_field;

SOLUTION #2 (uses a json object to store the counts and avoids the join)

SELECT group_field, 
    avg(
        if(
            rank - json_extract(@vols, path)/2 BETWEEN 0 AND 1,
            value_field,
            null
        )
    ) as median
FROM (
    SELECT group_field, value_field, path, 
        @rnk := if(@curr = group_field, @rnk+1, 1) as rank,
        @vols := json_set(
            @vols, 
            path, 
            coalesce(json_extract(@vols, path), 0) + 1
        ) as vols,
        @curr := group_field
    FROM (
        SELECT p.group_field, p.value_field, concat('$.', p.group_field) as path
        FROM table_name
        JOIN (SELECT @curr:='', @rnk:=1, @vols:=json_object()) v
        ORDER BY group_field, value_field DESC
    ) z
) y GROUP BY group_field;


来源:https://stackoverflow.com/questions/53714212/create-a-user-defined-function-that-works-with-group-by-in-mysql

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!