How can I improve this 4 count self join?

Posted by 久未见 on 2019-12-11 09:32:22

Question


Suppose I have the following sample data set:

emplid | Citizenship |
100001 | USA         |
100001 | CAN         |
100001 | CHN         |
100002 | USA         |
100002 | CHN         |
100003 | USA         |

And I want to arrange it to show the citizenships for each employee in one row. We can assume that an employee has up to four citizenships. The output would look like this:

emplid | Citizenship_1 | Citizenship_2 | Citizenship_3
100001 | USA          | CHN           | CAN
100002 | USA          | CHN           |
100003 | USA          |               |

The only working solution I have been able to come up with is this:

SELECT e.emplid, MAX(e.citizenship) AS citizenship1, 
                 MAX(e1.citizenship) AS citizenship2, 
                 MAX(e2.citizenship) AS citizenship3, 
                 MAX(e3.citizenship) AS citizenship4
FROM employee e
LEFT JOIN employee e1 ON e1.emplid = e.emplid AND e1.citizenship < e.citizenship
LEFT JOIN employee e2 ON e2.emplid = e1.emplid AND e2.citizenship < e1.citizenship
LEFT JOIN employee e3 ON e3.emplid = e2.emplid AND e3.citizenship < e2.citizenship
GROUP BY e.emplid

As the data set grows, this query becomes increasingly inefficient, but I can't find a way to rewrite it.


Answer 1:


Why not just concatenate the citizenships into a list?

select e.emplid, group_concat(citizenship) as citizenships
from employee e
group by e.emplid;

If you want to have four separate columns, you can do something like:

select e.emplid,
       substring_index(group_concat(citizenship), ',', 1) as c1,
       (case when count(*) >= 2
             then substring_index(substring_index(group_concat(citizenship), ',', 2), ',', -1)
        end) as c2,
       (case when count(*) >= 3
             then substring_index(substring_index(group_concat(citizenship), ',', 3), ',', -1)
        end) as c3,
       (case when count(*) >= 4
             then substring_index(substring_index(group_concat(citizenship), ',', 4), ',', -1)
        end) as c4
from employee e
group by e.emplid;
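
To try the list-building idea on the question's sample data, here is a minimal sketch using Python's built-in sqlite3 module. This is an assumption for illustration only: the answer's queries look like MySQL, SQLite also happens to provide `group_concat` but has no `substring_index`, so only the first query is reproduced. The names `ROWS` and `citizenship_lists` are hypothetical.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    (100001, "USA"), (100001, "CAN"), (100001, "CHN"),
    (100002, "USA"), (100002, "CHN"),
    (100003, "USA"),
]


def citizenship_lists(rows):
    """Return (emplid, comma-separated citizenships) for each employee."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employee (emplid INTEGER, citizenship TEXT)")
    conn.executemany("INSERT INTO employee VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT e.emplid, group_concat(e.citizenship) AS citizenships
        FROM employee e
        GROUP BY e.emplid
        ORDER BY e.emplid
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for emplid, citizenships in citizenship_lists(ROWS):
        print(emplid, citizenships)
```

The order of the values inside each list is not guaranteed here, so anything that picks the first, second, or third item out of the list should not rely on a particular order unless the database is told how to sort the values.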



Answer 2:


This solution ranks each employee's citizenships alphabetically and then puts each one into the column that matches its rank.

SELECT 
    emplid,
    MAX(CASE WHEN R = 1 THEN Citizenship ELSE NULL END) AS Citizenship_1,
    MAX(CASE WHEN R = 2 THEN Citizenship ELSE NULL END) AS Citizenship_2,
    MAX(CASE WHEN R = 3 THEN Citizenship ELSE NULL END) AS Citizenship_3
FROM    
    (SELECT emplid,Citizenship,RANK() OVER(PARTITION BY emplid ORDER BY Citizenship) AS R FROM @T) AS DATA
GROUP BY
    emplid
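
The rank-then-pivot approach can be tried on the question's sample data with the sketch below. It makes several assumptions that are not in the answer: it runs on SQLite (3.25 or later, for window functions) through Python's sqlite3 module, it reads from a table named `employee` as in the question rather than `@T`, it adds a fourth column because the question allows up to four citizenships, and it swaps `RANK()` for `ROW_NUMBER()` so that a duplicated row cannot produce tied ranks and leave a column empty. `SAMPLE_ROWS` and `pivot_citizenships` are hypothetical names.

```python
import sqlite3

# Sample rows copied from the question.
SAMPLE_ROWS = [
    (100001, "USA"), (100001, "CAN"), (100001, "CHN"),
    (100002, "USA"), (100002, "CHN"),
    (100003, "USA"),
]

# Number each employee's citizenships alphabetically, then let
# conditional aggregation move each numbered row into its own column.
PIVOT_SQL = """
SELECT emplid,
       MAX(CASE WHEN r = 1 THEN citizenship END) AS citizenship_1,
       MAX(CASE WHEN r = 2 THEN citizenship END) AS citizenship_2,
       MAX(CASE WHEN r = 3 THEN citizenship END) AS citizenship_3,
       MAX(CASE WHEN r = 4 THEN citizenship END) AS citizenship_4
FROM (SELECT emplid,
             citizenship,
             ROW_NUMBER() OVER (PARTITION BY emplid
                                ORDER BY citizenship) AS r
      FROM employee) AS ranked
GROUP BY emplid
ORDER BY emplid
"""


def pivot_citizenships(rows):
    """Return one tuple per employee: (emplid, c1, c2, c3, c4)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employee (emplid INTEGER, citizenship TEXT)")
    conn.executemany("INSERT INTO employee VALUES (?, ?)", rows)
    result = conn.execute(PIVOT_SQL).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in pivot_citizenships(SAMPLE_ROWS):
        print(row)
```

Unlike the self join in the question, this reads the table once and lists the citizenships in ascending alphabetical order; missing citizenships come back as NULL (`None` in Python).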


Source: https://stackoverflow.com/questions/26786560/how-can-i-improve-this-4-count-self-join
