Grouping and counting

Submitted by 不羁岁月 on 2019-12-25 02:43:07

Question


I have a data set like this --

Team    Date        W/L
Team_1  04/01/0012  W
Team_1  06/01/0012  W
Team_1  07/01/0012  L
Team_1  14/01/0012  W
Team_1  19/01/0012  W
Team_1  30/01/0012  L
Team_1  14/02/0012  W
Team_1  17/02/0012  L
Team_1  20/02/0012  W
Team_2  01/01/0012  W
Team_2  05/01/0012  W
Team_2  09/01/0012  W
Team_2  13/01/0012  L
Team_2  18/01/0012  W
Team_2  25/01/0012  L
Team_2  05/02/0012  L
Team_2  13/02/0012  L
Team_2  19/02/0012  L
Team_3  02/01/0012  W
Team_3  02/01/0012  W
Team_3  06/01/0012  W
Team_3  10/01/0012  W
Team_3  19/01/0012  W
Team_3  31/01/0012  L
Team_3  11/02/0012  W
Team_3  15/02/0012  L
Team_3  21/02/0012  W

And from this I need to find the longest run of consecutive wins for each team --

Team    Count
Team_3  5
Team_2  3
Team_1  2

I am allowed to write only SQL queries. How can I write this?


Answer 1:


You can use the following:

SELECT  Team, TotalWins, FirstWin, LastWin
FROM    (   -- One row per winning streak, ranked longest-first within each team
            SELECT  Team,
                    WL,
                    COUNT(*) AS TotalWins,
                    MIN("Date") AS FirstWin,
                    MAX("Date") AS LastWin,
                    ROW_NUMBER() OVER(PARTITION BY Team, WL ORDER BY COUNT(*) DESC) AS RowNumber
            FROM    (   -- StreakGroup is constant within each run of consecutive identical results
                        SELECT  Team,
                                "Date",
                                WL,
                                ROW_NUMBER() OVER(PARTITION BY Team ORDER BY "Date")
                                  - ROW_NUMBER() OVER(PARTITION BY Team, WL ORDER BY "Date") AS StreakGroup
                        FROM    T
                    ) GroupedData
            WHERE   WL = 'W'
            GROUP BY Team, WL, StreakGroup
        ) RankedData
WHERE   RowNumber = 1;  -- keep only the longest streak per team

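It uses ROW_NUMBER to number each team's games twice: once partitioned by team alone, and once partitioned by team and result. The difference between these two numbers is the same for every game within a run of consecutive identical results, and differs between runs of the same result. So for your first team you would have: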

Team    Date        W/L RN1     RN2 DIFF
Team_1  04/01/0012  W   1       1   0
Team_1  06/01/0012  W   2       2   0
Team_1  07/01/0012  L   3       1   2
Team_1  14/01/0012  W   4       3   1
Team_1  19/01/0012  W   5       4   1
Team_1  30/01/0012  L   6       2   4
Team_1  14/02/0012  W   7       5   2
Team_1  17/02/0012  L   8       3   5
Team_1  20/02/0012  W   9       6   3

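Here RN1 is partitioned by team only, RN2 is partitioned by team and result, and DIFF = RN1 - RN2.

As you can see, if you remove the losses, DIFF stays constant within each group of consecutive victories and changes from one group to the next (here it goes up by one each time, because each pair of winning streaks is separated by a single loss):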
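
To reproduce the RN1/RN2/DIFF table, here is a minimal sketch that loads Team_1's games into an in-memory SQLite database through Python's sqlite3 module. Several things here are assumptions and not part of the original answer: the use of SQLite (window functions need version 3.25 or later), dates stored as ISO text so that they sort chronologically, and an Id column used as a tie-breaker for games played on the same day.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Id is an assumed surrogate key; "Date" holds ISO text so ORDER BY is chronological.
conn.execute('CREATE TABLE T (Id INTEGER PRIMARY KEY, Team TEXT, "Date" TEXT, WL TEXT)')

team_1 = [("0012-01-04", "W"), ("0012-01-06", "W"), ("0012-01-07", "L"),
          ("0012-01-14", "W"), ("0012-01-19", "W"), ("0012-01-30", "L"),
          ("0012-02-14", "W"), ("0012-02-17", "L"), ("0012-02-20", "W")]
conn.executemany('INSERT INTO T (Team, "Date", WL) VALUES (?, ?, ?)',
                 [("Team_1", day, result) for day, result in team_1])

# RN1 numbers all of the team's games; RN2 numbers them separately per result.
rows = conn.execute("""
    SELECT WL, RN1, RN2, RN1 - RN2 AS DIFF
    FROM (
        SELECT WL,
               ROW_NUMBER() OVER (PARTITION BY Team ORDER BY "Date", Id) AS RN1,
               ROW_NUMBER() OVER (PARTITION BY Team, WL ORDER BY "Date", Id) AS RN2
        FROM T
    ) AS Numbered
    ORDER BY RN1
""").fetchall()

for row in rows:
    print(row)
```

The printed tuples should match the W/L, RN1, RN2 and DIFF columns of the table above, row for row.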

Team    Date        W/L RN1     RN2 DIFF
Team_1  04/01/0012  W   1       1   0
Team_1  06/01/0012  W   2       2   0
---------------------------------------
Team_1  14/01/0012  W   4       3   1
Team_1  19/01/0012  W   5       4   1
---------------------------------------
Team_1  14/02/0012  W   7       5   2
---------------------------------------
Team_1  20/02/0012  W   9       6   3

You can then group by this difference, so that each group contains only consecutive wins, and count the rows in each group to get the length of every streak. I've then used another ROW_NUMBER, ordered by that count descending, to pick the longest streak per team.
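
As a quick check against the expected output in the question, the sketch below runs the same grouping idea over all three teams in an in-memory SQLite database. It is a simplified variant, not the query above: since only the streak length is wanted, it swaps the outer ROW_NUMBER for a plain MAX per team. The Id tie-breaker is also an assumption. Team_3 has two games on the same date, and without a deterministic order the two ROW_NUMBER calls are not guaranteed to number those games consistently, which could split a streak.

```python
import sqlite3

# Sample data from the question as "MM-DD result" pairs (year 0012).
SAMPLE = {
    "Team_1": "01-04 W,01-06 W,01-07 L,01-14 W,01-19 W,01-30 L,02-14 W,02-17 L,02-20 W",
    "Team_2": "01-01 W,01-05 W,01-09 W,01-13 L,01-18 W,01-25 L,02-05 L,02-13 L,02-19 L",
    "Team_3": "01-02 W,01-02 W,01-06 W,01-10 W,01-19 W,01-31 L,02-11 W,02-15 L,02-21 W",
}

conn = sqlite3.connect(":memory:")
# Id is an assumed surrogate key, used only to break ties between same-day games.
conn.execute('CREATE TABLE T (Id INTEGER PRIMARY KEY, Team TEXT, "Date" TEXT, WL TEXT)')
for team, games in SAMPLE.items():
    for game in games.split(","):
        day, result = game.split()
        conn.execute('INSERT INTO T (Team, "Date", WL) VALUES (?, ?, ?)',
                     (team, "0012-" + day, result))

QUERY = """
SELECT Team, MAX(StreakLength) AS LongestStreak
FROM (
    -- One row per winning streak, with its length
    SELECT Team, COUNT(*) AS StreakLength
    FROM (
        SELECT Team, WL,
               ROW_NUMBER() OVER (PARTITION BY Team ORDER BY "Date", Id)
             - ROW_NUMBER() OVER (PARTITION BY Team, WL ORDER BY "Date", Id) AS StreakGroup
        FROM T
    ) AS GroupedData
    WHERE WL = 'W'
    GROUP BY Team, StreakGroup
) AS Streaks
GROUP BY Team
ORDER BY LongestStreak DESC
"""

longest_streaks = conn.execute(QUERY).fetchall()
print(longest_streaks)  # [('Team_3', 5), ('Team_2', 3), ('Team_1', 2)]
```

If the first and last dates of the streak are needed as well, the ROW_NUMBER form from the answer is the better fit, because it keeps the whole row of the longest streak and not just its count.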

Example on SQL Fiddle



Source: https://stackoverflow.com/questions/18978212/grouping-and-counting