How to find consecutive rows based on the value of a column?

清酒与你 2020-12-01 04:50

I have some data that I want to group based on the value of the data column. If there are 3 or more consecutive rows whose data is greater than 10, then those rows

2 Answers
  • 2020-12-01 05:21

    Try this:

    WITH cte
    AS
    (
        SELECT *,COUNT(1) OVER(PARTITION BY cnt) pt  FROM
        (
            SELECT tt.*
               ,(SELECT COUNT(id) FROM t WHERE data <= 10 AND ID < tt.ID) AS cnt
            FROM  t tt
            WHERE data > 10
        ) t1
    )
    
    SELECT id, [when], data FROM cte WHERE pt >= 3
    

    SQL FIDDLE DEMO

    OUTPUT

    id  when                    data
    2   2013-08-02 00:00:00.000 121
    3   2013-08-03 00:00:00.000 132
    4   2013-08-04 00:00:00.000 15
    6   2013-08-06 00:00:00.000 1435
    7   2013-08-07 00:00:00.000 143
    8   2013-08-08 00:00:00.000 18
    9   2013-08-09 00:00:00.000 19
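
    To try the query without a SQL Server instance, here is a minimal sketch that runs it against an in-memory SQLite database from Python. Several things here are assumptions and not part of the answer: the use of SQLite (window functions need SQLite 3.25 or newer), the column types, storing [when] as plain text dates, and the added ORDER BY id to make the result order deterministic. The sample rows are copied from the intermediate output shown later in this answer.

```python
import sqlite3

# Sample rows (id, when, data) copied from the answer's intermediate output.
# Dates are stored as plain text here; that is an assumption of this sketch.
SAMPLE_ROWS = [
    (1, "2013-08-01", 1), (2, "2013-08-02", 121), (3, "2013-08-03", 132),
    (4, "2013-08-04", 15), (5, "2013-08-05", 9), (6, "2013-08-06", 1435),
    (7, "2013-08-07", 143), (8, "2013-08-08", 18), (9, "2013-08-09", 19),
    (10, "2013-08-10", 1), (11, "2013-08-11", 1234), (12, "2013-08-12", 124),
    (13, "2013-08-13", 6),
]

# Same query as in the answer, plus ORDER BY id (added for a stable order).
QUERY = """
WITH cte AS (
    SELECT *, COUNT(1) OVER (PARTITION BY cnt) AS pt
    FROM (
        SELECT tt.*,
               (SELECT COUNT(id) FROM t WHERE data <= 10 AND id < tt.id) AS cnt
        FROM t tt
        WHERE data > 10
    ) t1
)
SELECT id, [when], data FROM cte WHERE pt >= 3 ORDER BY id
"""


def run_demo():
    """Create the table, load the sample rows, and return the matching rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, [when] TEXT, data INTEGER)")
    conn.executemany("INSERT INTO t VALUES (?, ?, ?)", SAMPLE_ROWS)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


matched = run_demo()
print([row[0] for row in matched])
```

    With the sample rows, this should list ids 2, 3, 4, 6, 7, 8 and 9, matching the output above.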
    

    EDIT

    First, the inner query counts, for each row, how many rows with a smaller id have data <= 10:

    SELECT tt.*
         ,(SELECT COUNT(id) FROM t WHERE data <= 10 AND ID < tt.ID) AS cnt
    FROM  t tt
    

    output

    id  when                    data    cnt
    1   2013-08-01 00:00:00.000 1       0
    2   2013-08-02 00:00:00.000 121     1
    3   2013-08-03 00:00:00.000 132     1
    4   2013-08-04 00:00:00.000 15      1
    5   2013-08-05 00:00:00.000 9       1
    6   2013-08-06 00:00:00.000 1435    2
    7   2013-08-07 00:00:00.000 143     2
    8   2013-08-08 00:00:00.000 18      2
    9   2013-08-09 00:00:00.000 19      2
    10  2013-08-10 00:00:00.000 1       2
    11  2013-08-11 00:00:00.000 1234    3
    12  2013-08-12 00:00:00.000 124     3
    13  2013-08-13 00:00:00.000 6       3
    

    Then we filter the records with data > 10

    WHERE data > 10
    

    Now we count the records in each group by partitioning on the cnt column:

    SELECT *,COUNT(1) OVER(PARTITION BY cnt) pt  FROM
    (
        SELECT tt.*
            ,(SELECT COUNT(id) FROM t WHERE data <= 10 AND ID < tt.ID) AS cnt
        FROM  t tt
        WHERE data > 10
    ) t1
    

    Output

    id  when                    data    cnt pt
    2   2013-08-02 00:00:00.000 121     1   3
    3   2013-08-03 00:00:00.000 132     1   3
    4   2013-08-04 00:00:00.000 15      1   3
    6   2013-08-06 00:00:00.000 1435    2   4
    7   2013-08-07 00:00:00.000 143     2   4
    8   2013-08-08 00:00:00.000 18      2   4
    9   2013-08-09 00:00:00.000 19      2   4
    11  2013-08-11 00:00:00.000 1234    3   2
    12  2013-08-12 00:00:00.000 124     3   2
    

    The query above is wrapped in a CTE, which acts like a temporary table.

    Finally, select the records whose group contains at least 3 consecutive rows (pt >= 3):

    SELECT id, [when], data FROM cte WHERE pt >= 3
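
    The steps above can also be traced outside the database. The following is a rough Python sketch of the same idea, not the answer's code: the function name and its parameters are made up for illustration, and the (id, data) pairs are taken from the sample output above with the dates left out.

```python
from collections import Counter

# (id, data) pairs from the sample output; the dates are omitted for brevity.
rows = [(1, 1), (2, 121), (3, 132), (4, 15), (5, 9), (6, 1435), (7, 143),
        (8, 18), (9, 19), (10, 1), (11, 1234), (12, 124), (13, 6)]


def rows_in_long_runs(rows, threshold=10, min_run=3):
    """Return the rows that belong to a run of at least min_run rows above threshold."""
    low_seen = 0   # number of rows with data <= threshold seen so far (the "cnt" value)
    tagged = []    # rows with data > threshold, each tagged with its cnt
    for row_id, data in sorted(rows):
        if data <= threshold:
            low_seen += 1
        else:
            tagged.append((row_id, data, low_seen))
    # Equivalent of COUNT(1) OVER (PARTITION BY cnt): size of each cnt group.
    sizes = Counter(cnt for _, _, cnt in tagged)
    # Equivalent of WHERE pt >= 3.
    return [(row_id, data) for row_id, data, cnt in tagged if sizes[cnt] >= min_run]


result = rows_in_long_runs(rows)
print(result)
```

    Every row above the threshold that sits between the same two low rows gets the same cnt, which is why the group size equals the length of the consecutive run.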
    

    ANOTHER SOLUTION

    ;WITH partitioned AS (
      SELECT *, id - ROW_NUMBER() OVER (ORDER BY id) AS grp
      FROM t
      WHERE data > 10
    ),
    counted AS (
      SELECT *, COUNT(*) OVER (PARTITION BY grp) AS cnt
      FROM partitioned
    )
    
    SELECT id, [when], data
    FROM counted
    WHERE cnt >= 3
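
    A small Python sketch of what id - ROW_NUMBER() does may help here. The function name and parameters are hypothetical, and the data is the same (id, data) sample as above. One thing worth checking before relying on this variant: it appears to assume that id increases by exactly 1 from one row to the next, because a gap in id starts a new group even when no row with data <= 10 lies in between.

```python
from itertools import groupby

# (id, data) pairs from the sample output; the dates are omitted for brevity.
rows = [(1, 1), (2, 121), (3, 132), (4, 15), (5, 9), (6, 1435), (7, 143),
        (8, 18), (9, 19), (10, 1), (11, 1234), (12, 124), (13, 6)]


def islands_by_id_offset(rows, threshold=10, min_run=3):
    """Group rows above threshold by id - row_number and keep groups of min_run or more."""
    kept = [(row_id, data) for row_id, data in sorted(rows) if data > threshold]
    # grp = id - ROW_NUMBER() OVER (ORDER BY id); row numbers start at 1.
    keyed = [(row_id - rn, row_id, data)
             for rn, (row_id, data) in enumerate(kept, start=1)]
    result = []
    # grp never decreases as id grows, so grouping adjacent equal keys
    # matches PARTITION BY grp.
    for _, members in groupby(keyed, key=lambda item: item[0]):
        members = list(members)
        if len(members) >= min_run:
            result.extend((row_id, data) for _, row_id, data in members)
    return result


result = islands_by_id_offset(rows)
print(result)

# Ids 1, 2, 4 are adjacent rows, but the gap in id splits them into two groups.
print(islands_by_id_offset([(1, 20), (2, 30), (4, 40)]))
```

    On the sample data both solutions agree; they differ only when the id column has gaps.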
    

    SQL FIDDLE DEMO

  • 2020-12-01 05:33

    First, we discard any row that has a value of 10 or less:

    WITH t10 AS (SELECT * FROM t WHERE data > 10),
    

    Next, get the rows whose immediate predecessor is also more than 10:

    okleft AS (SELECT t10.*, pred.id AS predid FROM
       t10
       INNER JOIN t pred ON 
            pred.[when] < t10.[when]
            AND pred.[when] >= ALL (SELECT [when] FROM t t2 WHERE t2.[when] < t10.[when])
       WHERE pred.data > 10
    ),
    

    Also get the rows whose immediate successor is also more than 10:

    okright as (SELECT t10.*, succ.id AS succid FROM
       t10
       INNER JOIN t succ ON
            succ.[when] > t10.[when] 
            AND succ.[when] <= ALL (SELECT [when] FROM t t2 WHERE t2.[when] > t10.[when])
       WHERE succ.data > 10
    ),
    

    Finally, select any row that starts a sequence of 3, is in the middle of one, or ends one.

    A row whose valid right side also has a valid right side starts a sequence of at least 3:

    starts3 AS (SELECT id, [when], data FROM okright r1 WHERE EXISTS(
    SELECT NULL FROM okright r2 WHERE r2.id = r1.succid)),
    

    A row whose predecessor and successor are both valid is in the middle of at least 3:

    mid3 AS (SELECT id, [when], data FROM okleft l WHERE EXISTS(
    SELECT NULL FROM okright r WHERE r.id = l.id)),
    

    A row whose valid left side also has a valid left side ends a sequence of at least 3:

    ends3 AS (SELECT id, [when], data FROM okleft l1 WHERE EXISTS(
    SELECT NULL FROM okleft l2 WHERE l2.id = l1.predid))
    

    Join them all up, with UNION to remove duplicates:

    SELECT * FROM starts3
    UNION SELECT * FROM mid3
    UNION SELECT * FROM ends3
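
    The same neighbour test can be sketched in Python to see which rows each case catches. This is only an illustration under assumptions: the function name is invented, rows are ordered by id instead of [when] (the two orders coincide in the sample data), and the (id, data) pairs come from the sample table used in the other answer.

```python
# (id, data) pairs from the sample data; the dates are omitted for brevity.
rows = [(1, 1), (2, 121), (3, 132), (4, 15), (5, 9), (6, 1435), (7, 143),
        (8, 18), (9, 19), (10, 1), (11, 1234), (12, 124), (13, 6)]


def rows_in_runs_by_neighbours(rows, threshold=10):
    """Keep rows that start, sit in the middle of, or end a run of at least 3."""
    ordered = sorted(rows)
    n = len(ordered)

    def ok_at(i):
        # True when position i exists and its data is above the threshold.
        return 0 <= i < n and ordered[i][1] > threshold

    result = []
    for i, (row_id, data) in enumerate(ordered):
        if not ok_at(i):
            continue
        starts = ok_at(i + 1) and ok_at(i + 2)   # like starts3
        middle = ok_at(i - 1) and ok_at(i + 1)   # like mid3
        ends = ok_at(i - 1) and ok_at(i - 2)     # like ends3
        if starts or middle or ends:             # like the UNION of the three
            result.append((row_id, data))
    return result


result = rows_in_runs_by_neighbours(rows)
print(result)
```

    Rows 11 and 12 are dropped because neither has two valid neighbours on one side or one on each side.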
    

    SQL Fiddle: http://sqlfiddle.com/#!3/12f3a/9

    Edit: I like BVR's answer, much more elegant than mine.
