Pick a random attribute from group in Redshift

攒了一身酷 2021-01-23 04:37

I have a data set in the following form:

id  |   attribute
-----------------
1   |   a
2   |   b
2   |   a
2   |   a
3   |   c

Desired output:

4 answers

遥遥无期 (original poster)

2021-01-23 05:04

This is an answer for the related question here. That question is closed, so I am posting the answer here.

Here is a method to aggregate a column into a string:

select * from temp;
 attribute 
-----------
 a
 c
 b

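1) Give a unique rank to each row. row_number() is used rather than rank(), because rank() assigns the same number to duplicate values.

with sub_table as (select attribute, row_number() over (order by attribute) rnk from temp)
select * from sub_table;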

 attribute | rnk 
-----------+-----
 a         |   1
 b         |   2
 c         |   3

2) Use the concatenation operator || to combine the rows into one string

-- row_number() keeps rnk unique, so each subquery returns at most one row
with sub_table as (select attribute, row_number() over (order by attribute) rnk from temp)
select (select attribute from sub_table where rnk = 1)||
       (select attribute from sub_table where rnk = 2)||
       (select attribute from sub_table where rnk = 3) res_string;

 res_string 
------------
 abc

This only works for a fixed number of rows (X) in that column, because each row needs its own subquery. They can be the first X rows ordered by some attribute in the "order by" clause. I'm guessing this is expensive, since every subquery reads sub_table again.
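If the number of rows is not known in advance, Redshift's built-in LISTAGG aggregate may be a simpler route than one subquery per rank. The following is only a sketch of that alternative, not the method described above; it assumes the same single-column table temp, and the empty-string delimiter is my choice so that the result has the same shape as res_string.

```sql
-- Sketch: aggregate every row of temp into one string with LISTAGG.
-- Assumption: temp is the one-column table shown earlier in this answer.
-- The second argument is the delimiter; '' concatenates with nothing in between.
select listagg(attribute, '') within group (order by attribute) as res_string
from temp;
```

Adding a "group by" column would give one string per group instead of one for the whole table. Note that LISTAGG returns an error if the resulting string exceeds the maximum VARCHAR length.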

A CASE expression can be used to deal with the NULLs that occur when a certain rank does not exist. Without it, a missing rank makes the whole result NULL, because || returns NULL if either operand is NULL.

-- rnk = 4 may not exist, so it is replaced by '' when the subquery returns NULL
with sub_table as (select attribute, row_number() over (order by attribute) rnk from temp)
select (select attribute from sub_table where rnk = 1)||
       (select attribute from sub_table where rnk = 2)||
       (select attribute from sub_table where rnk = 3)||
       (case when (select attribute from sub_table where rnk = 4) is NULL then '' 
             else (select attribute from sub_table where rnk = 4) end) as res_string;
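The same NULL handling can probably be written more compactly with COALESCE, which avoids repeating each subquery inside a CASE. This is a sketch under the same assumptions as above (table temp, up to four rows); wrapping every rank, not only the last one, and using row_number() so that ranks stay unique when attribute has duplicates, are my choices.

```sql
-- Sketch: COALESCE turns a missing rank (NULL) into an empty string,
-- so the concatenation never collapses to NULL.
with sub_table as (
    select attribute,
           row_number() over (order by attribute) as rnk
    from temp
)
select coalesce((select attribute from sub_table where rnk = 1), '') ||
       coalesce((select attribute from sub_table where rnk = 2), '') ||
       coalesce((select attribute from sub_table where rnk = 3), '') ||
       coalesce((select attribute from sub_table where rnk = 4), '') as res_string;
```

The limitation stays the same: the number of ranks has to be written out by hand.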
