How to summarize all possible combinations of variables?

后端未结

关注

 3  646

北恋 2021-02-11 11:44

I am trying to summarize the count based on the all possible combinations of variables. Here is an example data:

3条回答

误落风尘 (楼主)

2021-02-11 12:05

For this sort of query using some of the built in aggregate tools is quite straight forward.

First off setup some sample data based on your sample image:

declare @Table1 as table
    ([id] int, [a] int, [b] int, [c] int)
;

INSERT INTO @Table1
    ([id], [a], [b], [c])
VALUES
    (10001, 1, 3, 3),
    (10002, 0, 0, 0),
    (10003, 3, 6, 0),
    (10004, 7, 0, 0),
    (10005, 0, 0, 0)
;

Since you want the count of IDs for each possible combination of non zero attributes A, B, and C, the first step is eliminate the zeros and convert the non zero values to a single value we can summarize on, in this case I'll use the attributes name. After that it's a simple matter of performing the aggregate, using the CUBE clause in the group by statement to generate the combinations. Lastly in the having clause prune out the unwanted summations. Mostly that's just ignoring the null values in the attributes, and optionally removing the grand summary (count of all rows)

with t1 as (
select case a when 0 then null else 'a' end a
     , case b when 0 then null else 'b' end b
     , case c when 0 then null else 'c' end c
     , id
  from @Table1
)
select a, b, c, count(id) cnt
  from t1
  group by cube(a,b,c)
  having (a is not null or grouping(a) = 1) -- For each attribute
     and (b is not null or grouping(b) = 1) -- only allow nulls as
     and (c is not null or grouping(c) = 1) -- a result of grouping.
     and grouping_id(a,b,c) <> 7  -- exclude the grand total
  order by grouping_id(a,b,c);

Here are the results:

    a       b       c       cnt
1   a       b       c       1
2   a       b       NULL    2
3   a       NULL    c       1
4   a       NULL    NULL    3
5   NULL    b       c       1
6   NULL    b       NULL    2
7   NULL    NULL    c       1

And finally my original rextester link: http://rextester.com/YRJ10544

@lad2025 Here's a dynamic version (sorry my SQL Server skills aren't as strong as my Oracle skills, but it works). Just set the correct values for @Table and @col and it should work as long as all other columns are numeric attributes:

declare @sql varchar(max), @table varchar(30), @col varchar(30);
set @table = 'Table1';
set @col = 'id';
with x(object_id, column_id, name, names, proj, pred, max_col, cnt) 
  as (
    select object_id, column_id, name, cast(name as varchar(max))
     , cast('case '+name+' when 0 then null else '''+name+''' end '+name as varchar(4000))
     , cast('('+name+' is not null or grouping('+name+') = 1)' as varchar(4000))
     , (select max(column_id) from sys.columns m where m.object_id = c.object_id and m.name <>'ID')
     , 1
     from sys.columns c
    where object_id = OBJECT_ID(@Table)
      and column_id = (select min(column_id) from sys.columns m where m.object_id = c.object_id and m.name <> @col)
    union all
    select x.object_id, c.column_id, c.name, cast(x.names+', '+c.name as varchar(max))
     , cast(proj+char(13)+char(10)+'     , case '+c.name+' when 0 then null else '''+c.name+''' end '+c.name as varchar(4000))
     , cast(pred+char(13)+char(10)+'   and ('+c.name+' is not null or grouping('+c.name+') = 1)' as varchar(4000))
     , max_col
     , cnt+1
      from x join sys.columns c on c.object_id = x.object_id and c.column_id = x.column_id+1
)
select @sql='with t1 as (
select '+proj+'
     , '+@col+'
  from '+@Table+'
)
select '+names+'
     , count('+@col+') cnt 
  from t1
 group by cube('+names+')
having '+pred+'
   and grouping_id('+names+') <> '+cast(power(2,cnt)-1 as varchar(10))+'
 order by grouping_id('+names+');'
  from x where column_id = max_col;

select @sql sql;
exec (@sql);

Rextester

0 讨论(0)

查看其它3个回答