MySQL: Count multiple occurrences of multiplexed entries

甜味超标 2021-01-28 23:54

I have a MySQL table as shown below:

  id          reasonCode
------      ------------------
  1           0, 1
  2           0
  3           1, 2, 3
  4                 


        
3 answers
  • 2021-01-29 00:37

    If you have a table of reason codes (I presume you do, so that you know what, for example, reason code 1 means), then you could do something like this:-

    SELECT a.id, COUNT(b.id)
    FROM reason_codes a
    LEFT OUTER JOIN id_reason_code b
    ON FIND_IN_SET(a.id, b.reasonCode)
    GROUP BY a.id
    

    However, one problem with this is that you have spaces after the commas. A comma separated field is a problem at the best of times (it is better to split the values off into multiple rows of another table; it is easy enough to concatenate them back together afterwards if need be), but the spaces after the commas cause a specific issue here: FIND_IN_SET compares whole list elements, so an element stored as ' 1' does not match 1. (Note that removing these spaces would also make the solution by @Vignesh Kumar a bit simpler.)

    To get round this you could do:-

    SELECT a.id, COUNT(b.id)
    FROM reason_codes a
    LEFT OUTER JOIN id_reason_code b
    ON FIND_IN_SET(a.id, REPLACE(b.reasonCode, ' ', ''))
    GROUP BY a.id
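
    As an alternative to stripping the spaces at query time, the split into another table mentioned above could look roughly like this. This is only a sketch: the table id_reason_code_link and its columns are hypothetical names, the column types are assumptions, and the rows mirror the table in the question (id 4 has no reason codes, so it gets no rows).

    ```sql
    -- Lookup table of reason codes (column types are assumptions).
    CREATE TABLE reason_codes (
        id     INT PRIMARY KEY,
        reason VARCHAR(50)
    );
    INSERT INTO reason_codes (id, reason) VALUES
        (0, 'Reason A'), (1, 'Reason B'), (2, 'Reason C'), (3, 'Reason D');

    -- Hypothetical link table: one row per (record, reason code) pair.
    CREATE TABLE id_reason_code_link (
        record_id INT NOT NULL,
        reason_id INT NOT NULL,
        PRIMARY KEY (record_id, reason_id)
    );
    INSERT INTO id_reason_code_link (record_id, reason_id) VALUES
        (1, 0), (1, 1),
        (2, 0),
        (3, 1), (3, 2), (3, 3);

    -- Count how many records use each reason code.
    SELECT a.id, COUNT(b.record_id) AS occurrences
    FROM reason_codes a
    LEFT OUTER JOIN id_reason_code_link b
        ON b.reason_id = a.id
    GROUP BY a.id;

    -- Rebuild the comma separated list when it is needed for display.
    SELECT record_id,
           GROUP_CONCAT(reason_id ORDER BY reason_id SEPARATOR ', ') AS reasonCode
    FROM id_reason_code_link
    GROUP BY record_id;
    ```

    With this layout the join is a plain equality, so the problem with spaces does not arise.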
    

    EDIT - Explanation

    It is just a LEFT OUTER JOIN. This takes every row from the first table (i.e. reason_codes) and matches it against any matching rows in the 2nd table (i.e. id_reason_code; I am not sure what the table you show above is actually called). If there are no matching rows in the 2nd table, the row from the first table is still brought back, but with NULL in the columns from the 2nd table. In this case the join is done based on FIND_IN_SET, which looks for the first parameter in a list of comma separated values and returns its position (starting at 1) if found, or 0 if not (hence a match evaluates to true).

    The COUNT / GROUP BY then counts the number of values of b.id for each a.id and presents that count.

    The 2nd query is doing the same, but it is removing any spaces from the comma separated list before checking for values (required when you have a space as well as a comma separating the values).
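
    The effect of the spaces can be checked directly with FIND_IN_SET. A small sketch, using literal strings rather than the real tables:

    ```sql
    SELECT FIND_IN_SET('1', '0,1');                     -- 2: '1' is the 2nd element
    SELECT FIND_IN_SET('1', '0, 1');                    -- 0: the 2nd element is ' 1', not '1'
    SELECT FIND_IN_SET('1', REPLACE('0, 1', ' ', ''));  -- 2: spaces are removed first
    SELECT FIND_IN_SET('4', '0,1');                     -- 0: not in the list
    ```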

    If you had the following tables:-

    reason_codes table
    id  reason
    0   Reason A
    1   Reason B
    2   Reason C
    3   Reason D
    4   Reason E
    
    id_reason_code table
    id          reasonCode
    1           0,1
    2           0
    3           1,2,3
    4           2
    5           1,0
    

    then the following sql (removing the COUNT / GROUP BY):-

    SELECT a.id, b.id
    FROM reason_codes a
    LEFT OUTER JOIN id_reason_code b
    ON FIND_IN_SET(a.id, b.reasonCode)
    

    would give something like the following:-

    a.id    b.id
    0       1
    0       2
    0       5
    1       1
    1       3
    1       5
    2       3
    2       4
    3       3
    4       NULL
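
    To reproduce the result above, the two sample tables could be created along these lines. The column types are assumptions, since only the data is shown, and the ORDER BY is added only to make the row order predictable.

    ```sql
    -- Sample tables as listed above (column types are assumptions).
    CREATE TABLE reason_codes (
        id     INT PRIMARY KEY,
        reason VARCHAR(50)
    );
    CREATE TABLE id_reason_code (
        id         INT PRIMARY KEY,
        reasonCode VARCHAR(50)
    );

    INSERT INTO reason_codes (id, reason) VALUES
        (0, 'Reason A'), (1, 'Reason B'), (2, 'Reason C'),
        (3, 'Reason D'), (4, 'Reason E');

    INSERT INTO id_reason_code (id, reasonCode) VALUES
        (1, '0,1'), (2, '0'), (3, '1,2,3'), (4, '2'), (5, '1,0');

    -- The join without COUNT / GROUP BY.
    SELECT a.id, b.id
    FROM reason_codes a
    LEFT OUTER JOIN id_reason_code b
        ON FIND_IN_SET(a.id, b.reasonCode)
    ORDER BY a.id, b.id;
    ```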
    

    Running:-

    SELECT a.id, COUNT(b.id)
    FROM reason_codes a
    LEFT OUTER JOIN id_reason_code b
    ON FIND_IN_SET(a.id, b.reasonCode)
    GROUP BY a.id
    

    the COUNT / GROUP BY is giving one row for each value of a.id, and then a count of the values (non null) of b.id for that value of a.id:-

    a.id    count(b.id)
    0       3
    1       3
    2       2
    3       1
    4       0
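
    The count of 0 for reason code 4 depends on counting b.id rather than rows. A small self-contained sketch of the difference, using inline derived tables instead of the real ones:

    ```sql
    SELECT a.id,
           COUNT(*)    AS count_star,  -- counts the NULL-extended row too
           COUNT(b.id) AS count_b_id   -- counts only non-NULL b.id values
    FROM (SELECT 3 AS id UNION ALL SELECT 4) a
    LEFT OUTER JOIN (SELECT 3 AS id, '1,2,3' AS reasonCode) b
        ON FIND_IN_SET(a.id, b.reasonCode)
    GROUP BY a.id;
    -- id 3: count_star = 1, count_b_id = 1
    -- id 4: count_star = 1, count_b_id = 0
    ```

    So COUNT(*) would report 1 for a reason code that is never used, because the LEFT OUTER JOIN still returns one row for it.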
    

    You could also bring back the actual reason instead of the code if you wanted:-

    SELECT a.id, a.reason, COUNT(b.id)
    FROM reason_codes a
    LEFT OUTER JOIN id_reason_code b
    ON FIND_IN_SET(a.id, b.reasonCode)
    GROUP BY a.id, a.reason
    

    giving:-

    a.id    a.reason    count(b.id)
    0       Reason A    3
    1       Reason B    3
    2       Reason C    2
    3       Reason D    1
    4       Reason E    0
    
  • 2021-01-29 00:48

    ...the num of reason codes are going to change. User can add/remove reason codes whenever he wants...

    If the numbers that can appear as reason codes are not known in advance, then you are better off generating the query dynamically. You can do this in a stored procedure.

    Steps to follow:

    1. Fetch each reason code string into a variable.
    2. Split it to find each of the reason codes.
    3. Generate a select with the found code and a count of 1.
    4. Union all such statements, if any were generated.
    5. Loop until no more codes are present in the string.
    6. Repeat until all rows are processed.
    7. Now run an aggregate function on the generated result set, grouped by reason code.
    8. You have the results in hand.

    Part of sample code snippet:

    -- ...
    
    set @sql_query := 'select reason_code, sum(rc_count) as rc_count from (' ;
    set @sql_query := 
           concat( @sql_query, 
                   '\n  ( select null as reason_code, 0 as rc_count )' );
    
    -- ...
    
    splitting_reason_codes: loop
      set comma_position = locate( ',', reason_code_string );
      if comma_position then
        set rc := substring( reason_code_string, 1, comma_position-1 );
        set reason_code_string := 
                substring( reason_code_string, comma_position+1 );
      else
        set rc := reason_code_string;
      end if;
    
      if length( rc ) > 0 then
        set @sql_query := 
                concat( @sql_query, 
                        '\n   union all ( select ', rc, ', 1 )' );
      end if;
    
      if not comma_position then
        leave splitting_reason_codes;
      end if;
    end loop splitting_reason_codes;
    
    -- ...
    
    set @sql_query := concat( @sql_query, '\n) unique_reason_codes' );
    set @sql_query := concat( @sql_query, '\nwhere reason_code is not null' );
    set @sql_query := concat( @sql_query, '\ngroup by reason_code' );
    set @sql_query := concat( @sql_query, '\norder by reason_code' );
    
    prepare stmt from @sql_query;
    execute stmt;
    

    Demo @ SQL Fiddle
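
    For reference, the elided parts of the snippet could be filled in roughly as follows. This is a sketch under assumptions, not the code behind the demo: the procedure name count_reason_codes, the cursor, and the source table id_reason_code with its reasonCode column are all hypothetical, and the reason codes are assumed to be numeric because they are concatenated into the query unquoted. DELIMITER is a command of the mysql command line client.

    ```sql
    DELIMITER //

    CREATE PROCEDURE count_reason_codes()
    BEGIN
      DECLARE done INT DEFAULT 0;
      DECLARE reason_code_string VARCHAR(255);
      DECLARE rc VARCHAR(255);
      DECLARE comma_position INT;
      -- Assumed source table; spaces are stripped and NULL becomes ''.
      DECLARE code_rows CURSOR FOR
        SELECT REPLACE(COALESCE(reasonCode, ''), ' ', '') FROM id_reason_code;
      DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;

      -- Seed row keeps the UNION ALL chain valid even if no codes are found.
      SET @sql_query := 'select reason_code, sum(rc_count) as rc_count from (';
      SET @sql_query := CONCAT(@sql_query,
                               '\n  ( select null as reason_code, 0 as rc_count )');

      OPEN code_rows;
      reading_rows: LOOP
        FETCH code_rows INTO reason_code_string;
        IF done THEN
          LEAVE reading_rows;
        END IF;

        splitting_reason_codes: LOOP
          SET comma_position = LOCATE(',', reason_code_string);
          IF comma_position THEN
            SET rc = SUBSTRING(reason_code_string, 1, comma_position - 1);
            SET reason_code_string = SUBSTRING(reason_code_string, comma_position + 1);
          ELSE
            SET rc = reason_code_string;
          END IF;

          -- One "select <code>, 1" per code found.
          IF LENGTH(rc) > 0 THEN
            SET @sql_query := CONCAT(@sql_query,
                                     '\n   union all ( select ', rc, ', 1 )');
          END IF;

          IF NOT comma_position THEN
            LEAVE splitting_reason_codes;
          END IF;
        END LOOP splitting_reason_codes;
      END LOOP reading_rows;
      CLOSE code_rows;

      SET @sql_query := CONCAT(@sql_query, '\n) unique_reason_codes');
      SET @sql_query := CONCAT(@sql_query, '\nwhere reason_code is not null');
      SET @sql_query := CONCAT(@sql_query, '\ngroup by reason_code');
      SET @sql_query := CONCAT(@sql_query, '\norder by reason_code');

      PREPARE stmt FROM @sql_query;
      EXECUTE stmt;
      DEALLOCATE PREPARE stmt;
    END //

    DELIMITER ;

    CALL count_reason_codes();
    ```

    The COALESCE in the cursor matters: with a NULL string, both loop conditions would evaluate to NULL and the inner loop would never leave.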

  • 2021-01-29 00:59

    Try this query

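    SELECT Reason, COUNT(Reason) AS Occurrences FROM
    (
    SELECT
      id,
      -- TRIM removes the space that follows each comma in the stored list
      TRIM(SUBSTRING_INDEX(SUBSTRING_INDEX(reasoncode, ',', n.digit+1), ',', -1)) Reason
    FROM
      table1
      INNER JOIN
      -- n.digit is the 0-based position in the list; this handles at most 4 codes per row
      (SELECT 0 digit UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3) n
      ON LENGTH(REPLACE(reasoncode, ',' , '')) <= LENGTH(reasoncode)-n.digit
    ) T
    -- skip rows whose reasoncode is empty
    WHERE Reason <> ''
    GROUP BY Reason;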
    

    SQL FIDDLE

    The output would be:

    REASON  OCCURRENCES
    0           3
    1           3
    2           2
    3           1
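
    The query relies on two small tricks that can be tried in isolation. A sketch with a literal list in place of the reasoncode column:

    ```sql
    -- Nested SUBSTRING_INDEX picks one element: keep the first n+1 elements,
    -- then keep the last element of that prefix.
    SELECT SUBSTRING_INDEX('1,2,3', ',', 2);                            -- '1,2'
    SELECT SUBSTRING_INDEX(SUBSTRING_INDEX('1,2,3', ',', 2), ',', -1);  -- '2'

    -- The number of commas is the length lost when commas are removed.
    SELECT LENGTH('1,2,3') - LENGTH(REPLACE('1,2,3', ',', ''));         -- 2
    ```

    The join condition keeps only those values of n.digit that do not exceed the number of commas, so each row yields exactly one output row per stored code.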
    