Querying sequences of rows in SQL

被撕碎了的回忆 2021-02-14 23:24

Suppose I am storing events associated with users in a table as follows (with dt standing in for the timestamp of the event):



        
3 Answers
  • 2021-02-15 00:01

    I'm not at a computer to write code for this answer, but here is how I would approach a regex-based solution in SQL Server:

    1. Build a string from the result set. Something like http://blog.sqlauthority.com/2009/11/25/sql-server-comma-separated-values-csv-from-table-column/ should work if you omit the comma.
    2. Run your regex match against the resulting string. SQL Server does not provide this functionality natively, but you can use a CLR function for this purpose, as described at http://www.ideaexcursion.com/2009/08/18/sql-server-regular-expression-clr-udf/
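
    The two steps above can be sketched outside the database to show the logic. This is only an illustration in Python, not SQL Server code: the sample rows, the column order (userid, event, dt), and the helper names `event_sequences` and `users_matching` are all assumptions made for the example.

```python
import re
from collections import defaultdict

# Hypothetical sample rows: (userid, event, dt). The real table is not shown
# in the question, so these values are made up for illustration.
rows = [
    (1, "A", 1), (1, "D", 2), (1, "B", 3),
    (2, "B", 1), (2, "A", 2), (2, "C", 3),
]


def event_sequences(rows):
    """Step 1: concatenate each user's events, ordered by dt, with no separator."""
    by_user = defaultdict(list)
    for userid, event, dt in sorted(rows, key=lambda r: (r[0], r[2])):
        by_user[userid].append(event)
    return {userid: "".join(events) for userid, events in by_user.items()}


def users_matching(rows, pattern):
    """Step 2: return the user ids whose sequence contains a match for pattern."""
    regex = re.compile(pattern)
    return [u for u, seq in event_sequences(rows).items() if regex.search(seq)]


print(event_sequences(rows))
print(users_matching(rows, "A.*B"))
```

    With these made-up rows, user 1 has the sequence ADB and user 2 has BAC, so only user 1 matches 'A.*B'.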

    This should give you the functionality in SQL Server that your original question asks for. However, if you are analyzing a very large dataset, it could be quite slow, and there may be better ways to accomplish what you are looking for.

  • 2021-02-15 00:02

    With Postgres 9.x this is actually quite easy:

    select userid, 
           string_agg(event, '' order by dt) as event_sequence
    from events
    group by userid;
    

    Using that result you can now apply a regular expression on the event_sequence:

    select * 
    from (
      select userid, 
             string_agg(event, '' order by dt) as event_sequence
      from events
      group by userid
    ) t
    where event_sequence ~ 'A.*B'
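
    The `~` operator performs an unanchored match, so the pattern may match anywhere in the sequence. As a rough illustration of how the choice of pattern changes the result, here is a small Python sketch; it uses Python's `re.search` as a stand-in for `~` (the two agree for patterns this simple), the two sequences are the ones shown in the Oracle answer's sample output, and the patterns other than 'A.*B' are hypothetical.

```python
import re

# Sequences taken from the sample output in the Oracle answer on this page.
sequences = {1: "ADBCB", 2: "BBAAC"}

# 'A.*B' comes from the query above; the other two patterns are hypothetical
# and only there to contrast "eventually" with "immediately".
patterns = {
    "A.*B": "an A followed at some later point by a B",
    "AB": "an A immediately followed by a B",
    "^B": "a sequence that starts with B",
}


def matching_users(pattern):
    """User ids whose sequence contains an unanchored match for pattern."""
    return sorted(u for u, seq in sequences.items() if re.search(pattern, seq))


for pattern, meaning in patterns.items():
    print(f"{pattern!r} ({meaning}): {matching_users(pattern)}")
```

    Only user 1 matches 'A.*B', neither user matches 'AB', and only user 2 matches '^B'.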
    

    With Postgres 8.x you need to find a replacement for the string_agg() function (there are many examples online), and you need a sub-select to ensure the ordering of the aggregate, as 8.x does not support an order by inside an aggregate function.

  • 2021-02-15 00:15

    For Oracle (version 11g R2):

    If you happen to be using Oracle Database 11g R2, take a look at listagg. The code below should work, but I haven't tested it. The point is that you can use listagg.

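    -- Assumes the events table and the userid, event, dt columns
    -- used in the Postgres answer; adjust to your schema.
    select userid,
           listagg(event, '') within group (order by dt) as events
      from events
     group by userid
     order by userid;
    
       USERID  EVENTS
    ---------  --------------------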
    1          ADBCB
    2          BBAAC
    

    In earlier versions you can do this with a CONNECT BY clause. More details on listagg.
