Find start and end dates when one field changes

旧街凉风 提交于 2019-12-22 09:57:48

问题


I have this data in a table

FIELD_A   FIELD_B     FIELD_D
249052903   10/15/2011 N
249052903   11/15/2011 P ------------- VALUE CHANGED
249052903   12/15/2011 P
249052903   1/15/2012   N ------------- VALUE CHANGED
249052903   2/15/2012   N
249052903   3/15/2012   N
249052903   4/15/2012   N
249052903   5/15/2012   N
249052903   6/15/2012   N
249052903   7/15/2012   N
249052903   8/15/2012   N
249052903   9/15/2012   N

When ever the value in FIELD_D changes it forms a group and I need the min and max dates in that group. The query shoud return

FIELD_A   GROUP_START   GROUP_END
249052903   10/15/2011  10/15/2011
249052903   11/15/2011  12/15/2011
249052903   1/15/2012              9/15/2012

The examples that I have seen so far have the data in Field_D being unique. Here the data can repeat as shown, First it is "N" then it changes to "P" and then back to "N".

Any help will be appreciated

Thanks


回答1:


You can use analytic functions - LAG, LEAD, and COUNT() OVER to your advantage, if they are supported by your SQL implementation. SQL Fiddle here.

WITH EndsMarked AS (
  SELECT
    FIELD_A,
    FIELD_B,
    CASE WHEN FIELD_D = LAG(FIELD_D,1) OVER (ORDER BY FIELD_B)
         THEN 0 ELSE 1 END AS IS_START,
    CASE WHEN FIELD_D = LEAD(FIELD_D,1) OVER (ORDER BY FIELD_B)
         THEN 0 ELSE 1 END AS IS_END
  FROM T
), GroupsNumbered AS (
  SELECT
    FIELD_A,
    FIELD_B,
    IS_START,
    IS_END,
    COUNT(CASE WHEN IS_START = 1 THEN 1 END)
      OVER (ORDER BY FIELD_B) AS GroupNum
  FROM EndsMarked
  WHERE IS_START=1 OR IS_END=1
)
  SELECT
    FIELD_A,
    MIN(FIELD_B) AS GROUP_START,
    MAX(FIELD_B) AS GROUP_END
    FROM GroupsNumbered
    GROUP BY FIELD_A, GroupNum;



回答2:


This is fairly easy to express in SQL using subqueries:

select Field_A, Field_D, min(Field_B) as Group_Start, max(Field_B) as Group_End
from (select t.*,
             (select min(field_B)
              from t t2
              where t2.field_A = t.field_A and
                    t2.field_B > t.field_B and
                    t2.Field_D <> t.field_D
             ) as TheGroup
      from t
     ) t
group by Field_A, Field_D, TheGroup

This is assigning a group identifier using a correlated subquery. The identifier is the first value of Field_B where Field_D changes.

You don't mention the database you are using, so this uses standard SQL.




回答3:


Don't use SQL for this problem because it is not possible to do it in SQL with a single table scan since it requires comparison between records. It would need a full table scan plus at least a join with itself. It is trivial to implement a solution in a imperative language and it only requires a single table scan. Edit: a stored procedure would be best.



来源:https://stackoverflow.com/questions/15649530/find-start-and-end-dates-when-one-field-changes

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!