Redshift SQL: add and reset a counter with date and group considered

Submitted by 自古美人都是妖i on 2020-01-24 20:50:07

Question


Suppose I have the table below. I'd like a counter that counts the number of consecutive quarters a Customer (there are many) has been in Segment A. If the Customer moves to a different Segment between two quarters, the counter resets and starts over when the Customer returns to Segment A. The Counter column shows the output I want. I'm sure there are many ways to do this, but I can't figure it out. Please help. Thank you!

Quarter    Segment    Customer    *Counter*
Q1 2018    A          A1          1
Q2 2018    A          A1          2
Q3 2018    A          A1          3
Q4 2018    B          A1          1
Q1 2019    B          A1          2
Q2 2019    A          A1          1
Q1 2020    A          A1          *1* I want 1 not 2 here because it's not consecutive

Answer 1:


This is a type of gaps-and-islands problem. You can solve it with a difference of row numbers. The real difficulty is ordering the quarters, which are stored as strings such as 'Q1 2018', but string functions can handle that.

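select quarter, customer, segment,
       -- restart at 1 for every island: same customer and segment, no segment change, no skipped quarter
       row_number() over (partition by customer, segment, seqnum - seqnum_cs, qidx - seqnum
                          order by qidx) as counter
from (select t.*,
             -- position among all of the customer's rows
             row_number() over (partition by customer order by right(quarter, 4), left(quarter, 2)) as seqnum,
             -- position among the customer's rows in the same segment
             row_number() over (partition by customer, segment order by right(quarter, 4), left(quarter, 2)) as seqnum_cs,
             -- running quarter number; assumes the 'Qn YYYY' format and one row per customer per quarter
             cast(right(quarter, 4) as int) * 4 + cast(substring(quarter, 2, 1) as int) as qidx
      from t
     ) t
order by customer, seqnum;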

The key idea here is that the difference of the two row numbers stays constant across adjacent rows for a customer that share the same segment, so it identifies each run. It can be a bit hard to see why this is the case. However, if you look at the results of the subquery, you will see why this works.
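To make the subquery's columns concrete, here is a minimal Python sketch that replays the row numbering on the sample rows from the question. It is an illustration only, not part of the original query: the helper names quarter_index and add_counters are hypothetical, and the extra key quarter_index - seqnum is an assumption added here so that a skipped quarter (Q2 2019 to Q1 2020) also starts a new island, as the question asks. It also assumes one row per customer per quarter.

```python
from collections import defaultdict

# Sample rows from the question: (quarter, segment, customer)
rows = [
    ("Q1 2018", "A", "A1"),
    ("Q2 2018", "A", "A1"),
    ("Q3 2018", "A", "A1"),
    ("Q4 2018", "B", "A1"),
    ("Q1 2019", "B", "A1"),
    ("Q2 2019", "A", "A1"),
    ("Q1 2020", "A", "A1"),
]


def quarter_index(quarter):
    """Map a 'Qn YYYY' string to a running quarter number (year * 4 + n)."""
    return int(quarter[-4:]) * 4 + int(quarter[1])


def add_counters(rows):
    """Return (quarter, segment, customer, seqnum, seqnum_cs, counter) tuples."""
    ordered = sorted(rows, key=lambda r: (r[2], quarter_index(r[0])))
    seq = defaultdict(int)          # row number per customer
    seq_cs = defaultdict(int)       # row number per (customer, segment)
    island_size = defaultdict(int)  # rows seen so far in each island
    out = []
    for quarter, segment, customer in ordered:
        seq[customer] += 1
        seq_cs[(customer, segment)] += 1
        seqnum = seq[customer]
        seqnum_cs = seq_cs[(customer, segment)]
        # seqnum - seqnum_cs changes when another segment intervenes;
        # quarter_index - seqnum changes when a quarter is skipped.
        island = (customer, segment,
                  seqnum - seqnum_cs,
                  quarter_index(quarter) - seqnum)
        island_size[island] += 1
        out.append((quarter, segment, customer,
                    seqnum, seqnum_cs, island_size[island]))
    return out


result = add_counters(rows)
for row in result:
    print(*row)
# counters, in order: 1, 2, 3, 1, 2, 1, 1
```

Printing the rows shows seqnum - seqnum_cs staying at 0 for the first three A rows, at 3 for the two B rows, and at 2 for the last two A rows; the quarter gap is what separates those last two into islands of their own.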



Source: https://stackoverflow.com/questions/59402755/redshift-sql-add-and-reset-a-counter-with-date-and-group-considered
