MySQL SUM json values grouped by json keys

前端 未结 4 1384
情深已故
情深已故 2021-01-01 04:36

Is it possible to calculate sum of json values grouped by json keys?

Mysql version is 5.7.17 on Google cloud sql.

Example_1: A short exampl

相关标签:
4条回答
  • 2021-01-01 04:46

    SQL for the "short example":

    SELECT col1,
           JSON_OBJECT('key1', SUM(value1), 'key2', SUM(value2)) AS col2
    FROM
        (SELECT col1,
                JSON_EXTRACT(col2, '$.key1') AS value1,
                JSON_EXTRACT(col2, '$.key2') AS value2
         FROM tbl) subq
    GROUP BY col1;
    
    0 讨论(0)
  • 2021-01-01 04:48

    I believe that something like this could work.

    SELECT SUM(col2->>"$.key1"), SUM(col2->>"$.key2") FROM your_table GROUP BY col1
    
    0 讨论(0)
  • 2021-01-01 04:50

    Solution for Example_3:

    DROP TABLE IF EXISTS jsondata;
    CREATE TABLE jsondata (json JSON, col varchar(11));
    
    INSERT INTO jsondata VALUES
    ('[{"key_name" : "key1", "key_value" : 1}, {"key_name" : "key2", "key_value" : 3}]', 'aaa'),
    ('[{"key_name" : "key1", "key_value" : 0}, {"key_name" : "key3", "key_value" : 2}]', 'bbb'),
    ('[{"key_name" : "key1", "key_value" : 50}, {"key_name" : "key2", "key_value" : 0}]', 'aaa');
    
    DROP FUNCTION IF EXISTS json_sum_by_col;
    CREATE FUNCTION json_sum_by_col(col varchar(100)) RETURNS JSON
    BEGIN
        DECLARE i INT DEFAULT 0;
        DECLARE done INT DEFAULT FALSE;
        DECLARE select_values JSON;
        DECLARE temp_result JSON;
        DECLARE json_result JSON DEFAULT '[]';
        DECLARE temp_key varchar(11);
        DECLARE temp_value int;
    
        DECLARE curs CURSOR FOR SELECT json FROM jsondata WHERE jsondata.col = col;
        DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;
    
        OPEN curs;
            read_loop: LOOP
                SET i = 0;
                FETCH curs INTO select_values;
    
                IF done THEN
                  LEAVE read_loop;
                END IF;
    
                WHILE i < JSON_LENGTH(select_values) DO
                    -- extract key and value for i element
                    SET temp_key = JSON_EXTRACT(JSON_EXTRACT(select_values, CONCAT('$[',i,']')), '$.key_name');
                    SET temp_value = JSON_EXTRACT(JSON_EXTRACT(select_values, CONCAT('$[',i,']')), '$.key_value');
    
                    -- search json_result for key
                    SET @search = JSON_SEARCH(json_result, 'one', JSON_UNQUOTE(temp_key));
                    IF @search IS NOT NULL THEN
                        -- if exists add to existing value
                        SET @value_path = JSON_UNQUOTE(REPLACE(@search, 'name', 'value'));
                        SET temp_value = temp_value + JSON_EXTRACT(json_result, @value_path);
                        SET json_result = JSON_REPLACE(json_result, @value_path, temp_value);
                    ELSE
                        -- else attach it to json_result
                        SET temp_result = JSON_OBJECT("key_name", JSON_UNQUOTE(temp_key), "key_value", temp_value);
                        SET json_result = JSON_INSERT(json_result, CONCAT('$[',JSON_LENGTH(json_result),']'), temp_result);
                    END IF;
    
                    SELECT i + 1 INTO i;
                END WHILE;
            END LOOP;
        CLOSE curs;
    
        RETURN json_result;
    END;
    
    SELECT col, json_sum_by_col(col) FROM jsondata GROUP BY col;
    

    You can run it here

    0 讨论(0)
  • 2021-01-01 04:55

    TL;DR: yes, it can be done without knowing the key names in advance, and none of the alternate data formats has any advantage over the original.

    This can be done without knowing the key names in advance but it is painful... basically you have to look at every value in the table to determine the set of distinct keys in the table before you can sum them. Because of this requirement, and the fact that the alternate data formats can all have multiple keys per entry, there is no advantage to using any of them.

    Since you have to look for all the distinct keys, it is as easy to do the sums while you are looking for them. This function and procedure together will do that. The function, json_merge_sum, takes two JSON values and merges them, summing the values where a key appears in both values e.g.

    SELECT json_sum_merge('{"key1": 1, "key2": 3}', '{"key3": 1, "key2": 2}')
    

    Output:

    {"key1": 1, "key2": 5, "key3": 1}
    

    The function code:

    DELIMITER //
    DROP FUNCTION IF EXISTS json_merge_sum //
    CREATE FUNCTION json_sum_merge(IN j1 JSON, IN total JSON) RETURNS JSON
    BEGIN
      DECLARE knum INT DEFAULT 0;
      DECLARE jkeys JSON DEFAULT JSON_KEYS(j1);
      DECLARE kpath VARCHAR(20);
      DECLARE v INT;
      DECLARE l INT DEFAULT JSON_LENGTH(jkeys);
      kloop: LOOP
        IF knum >= l THEN
          LEAVE kloop;
        END IF;
        SET kpath = CONCAT('$.', JSON_EXTRACT(jkeys, CONCAT('$[', knum, ']')));
        SET v = JSON_EXTRACT(j1, kpath);
        IF JSON_CONTAINS_PATH(total, 'one', kpath) THEN
          SET total = JSON_REPLACE(total, kpath, JSON_EXTRACT(total, kpath) + v);
        ELSE
          SET total = JSON_SET(total, kpath, v);
        END IF;
        SET knum = knum + 1;
      END LOOP kloop;
      RETURN total;
    END
    

    The procedure, count_keys, performs the equivalent of the GROUP BY clause. It finds all the distinct values of col1 in the table and then calls json_sum_merge for each row which has that value of col1. Note the row select query performs a SELECT ... INTO a dummy variable so no output is generated, and uses a MIN() to ensure there is only one result (so that it can be assigned to a variable).

    The procedure:

    DELIMITER //
    DROP PROCEDURE IF EXISTS count_keys //
    CREATE PROCEDURE count_keys()
    BEGIN
      DECLARE finished INT DEFAULT 0;
      DECLARE col1val VARCHAR(20);
      DECLARE col1_cursor CURSOR FOR SELECT DISTINCT col1 FROM table2;
      DECLARE CONTINUE HANDLER FOR NOT FOUND SET finished=1;
      OPEN col1_cursor;
      col1_loop: LOOP
        FETCH col1_cursor INTO col1val;
        IF finished=1 THEN
          LEAVE col1_loop;
        END IF;
        SET @total = '{}';
        SET @query = CONCAT("SELECT MIN(@total:=json_sum_merge(col2, @total)) INTO @json FROM table2 WHERE col1='", col1val, "'");
        PREPARE stmt FROM @query;
        EXECUTE stmt;
        DEALLOCATE PREPARE stmt;
        SELECT col1val AS col1, @total AS col2;
      END LOOP col1_loop;
    END
    

    For a slightly larger example:

    col1    col2    
    aaa     {"key1": 1, "key2": 3}
    bbb     {"key1": 4, "key2": 2}
    aaa     {"key1": 50, "key3": 0}
    ccc     {"key2": 5, "key3": 1, "key4": 3}
    bbb     {"key1": 5, "key2": 1, "key5": 3}
    

    CALL count_keys() produces:

    col1    col2    
    aaa     {"key1": 51, "key2": 3, "key3": 0}
    bbb     {"key1": 9, "key2": 3, "key5": 3}
    ccc     {"key2": 5, "key3": 1, "key4": 3}
    

    Note I've called the table table2 in the procedure, you will need to edit that (in both queries) to suit.

    0 讨论(0)
提交回复
热议问题