How can I select adjacent rows to an arbitrary row (in sql or postgresql)?

猫巷女王i 2021-02-18 17:16

I want to select some rows based on certain criteria, and then take one entry from that set and the 5 rows before it and after it.

Now, I can do this numerically if ther

5 answers
  • 2021-02-18 17:27

    Here's another way to do it, using the analytic functions lead and lag. Analytic (window) functions can't be used in the WHERE clause, so you need a subquery or a CTE instead. Here's an example that works with the pagila sample database.

    WITH base AS (
        -- For every matching customer, record the customer_id found
        -- 5 rows earlier and 5 rows later in customer_id order.
        SELECT lag(customer_id, 5) OVER (ORDER BY customer_id) AS lag_id,
               lead(customer_id, 5) OVER (ORDER BY customer_id) AS lead_id,
               c.*
        FROM customer c
        WHERE c.active = 1
          AND c.last_name LIKE 'B%'
    )
    SELECT base.*
    FROM base
    JOIN (
        -- Select the center row; COALESCE so it still works if there
        -- aren't 5 rows in front or behind.
        SELECT COALESCE(lag_id, 0) AS lag_id, COALESCE(lead_id, 99999) AS lead_id
        FROM base
        WHERE customer_id = 280
    ) sub ON base.customer_id BETWEEN sub.lag_id AND sub.lead_id
    

    The problem with sgriffinusa's solution is that you don't know which row_number your center row will end up being. He assumed it will be row 30.
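
The fallback values 0 and 99999 in the query above are magic numbers; they could be taken from the data instead. Here is a minimal runnable sketch of that idea. It is not the pagila query: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL (assuming the bundled SQLite is 3.25 or newer, which is needed for window functions), and the `items` table, its `id` column, and the `window_around` helper are all made up for illustration.

```python
import sqlite3

def window_around(conn, center_id):
    """Return the ids of the center row plus up to 5 rows on each side."""
    sql = """
        WITH base AS (
            SELECT id,
                   LAG(id, 5)  OVER (ORDER BY id) AS lo,
                   LEAD(id, 5) OVER (ORDER BY id) AS hi
            FROM items
        ),
        bounds AS (
            -- Fall back to the real min/max when fewer than 5 rows exist
            -- on one side, instead of hard-coded 0 / 99999.
            SELECT COALESCE(lo, (SELECT MIN(id) FROM base)) AS lo,
                   COALESCE(hi, (SELECT MAX(id) FROM base)) AS hi
            FROM base
            WHERE id = ?
        )
        SELECT base.id
        FROM base
        JOIN bounds ON base.id BETWEEN bounds.lo AND bounds.hi
        ORDER BY base.id
    """
    return [row[0] for row in conn.execute(sql, (center_id,))]

# Hypothetical demo data: ids 10, 20, ..., 200.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO items VALUES (?)", [(i * 10,) for i in range(1, 21)])

print(window_around(conn, 100))  # center in the middle of the table
print(window_around(conn, 20))   # center near the start: fewer rows before it
```

The same CTE shape should carry over to PostgreSQL, though I have only reasoned about it against SQLite here.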

  • 2021-02-18 17:28

    For a similar query I use analytic functions without a CTE. Something like:

    select ...,
           LEAD(gm.id)    OVER (ORDER BY Cit DESC) as leadId,
           LEAD(gm.id, 2) OVER (ORDER BY Cit DESC) as leadId2,
           LAG(gm.id)     OVER (ORDER BY Cit DESC) as lagId,
           LAG(gm.id, 2)  OVER (ORDER BY Cit DESC) as lagId2
    ...
    where id = 25912
       or leadId = 25912
       or leadId2 = 25912
       or lagId = 25912
       or lagId2 = 25912

    This kind of query runs faster for me than the CTE with a join (the answer from Scott Bailey), though of course it is less elegant.

  • 2021-02-18 17:34

    You could do this using row_number() (available as of PostgreSQL 8.4). I'm not familiar with PostgreSQL, so the syntax may not be exactly right, but hopefully it illustrates the idea:

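    SELECT *
    FROM (SELECT ROW_NUMBER() OVER (ORDER BY primary_key) AS r, *
          FROM my_table  -- placeholder for your table name (TABLE is a reserved word)
          WHERE active = 1) t
    WHERE r BETWEEN 25 AND 35  -- row 30 plus the 5 rows on either side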
    

    This will generate a first column having sequential numbers. You can use this to identify the single row and the rows above and below it.
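
If you don't know the center row's number in advance, one option is to look it up from the same numbered set and filter relative to it. Below is a minimal sketch of that variation, not the query above verbatim. It runs against Python's built-in sqlite3 module as a stand-in for PostgreSQL (assuming SQLite 3.25 or newer for window functions); the `items` table, its columns, and the `neighbours` helper are hypothetical names chosen for the demo.

```python
import sqlite3

def neighbours(conn, center_id, k):
    """Return ids of the center row and up to k active rows on each side."""
    sql = """
        WITH numbered AS (
            -- Number only the rows that pass the filter.
            SELECT ROW_NUMBER() OVER (ORDER BY id) AS r, id
            FROM items
            WHERE active = 1
        )
        SELECT n.id
        FROM numbered AS n
        JOIN numbered AS c ON c.id = ?          -- c is the center row
        WHERE n.r BETWEEN c.r - ? AND c.r + ?   -- k rows either side
        ORDER BY n.r
    """
    return [row[0] for row in conn.execute(sql, (center_id, k, k))]

# Hypothetical demo data: ids 10, 20, ..., 200, with id 60 inactive.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, active INTEGER)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [(i * 10, 0 if i == 6 else 1) for i in range(1, 21)],
)

print(neighbours(conn, 100, 5))
```

Because the numbering happens after the `active = 1` filter, the inactive row is skipped and does not use up one of the k slots. If the center row itself fails the filter, this sketch returns nothing.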

  • 2021-02-18 17:35

    There are many ways to do it if you run two queries from a programming language, but here's one way to do it in a single SQL query:

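    -- my_table is a placeholder for your table name (TABLE is a reserved word).
    -- The two branches never overlap, so UNION ALL avoids a needless de-duplication.
    (SELECT * FROM my_table WHERE id >= 34 AND active = 1 ORDER BY id ASC LIMIT 6)
    UNION ALL
    (SELECT * FROM my_table WHERE id < 34 AND active = 1 ORDER BY id DESC LIMIT 5)
    ORDER BY id ASC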
    

    This would return the 5 rows above, the target row, and 5 rows below.
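
To see how this behaves with gaps and near the edges of the table, here is a small runnable sketch. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL. SQLite does not accept parenthesized SELECTs around a UNION, so each half is wrapped in a subquery instead, which is a deviation from the query above. The `items` table and the `rows_around` helper are hypothetical.

```python
import sqlite3

def rows_around(conn, center_id, k):
    """Return ids of the target row, k active rows before it and k after it."""
    sql = """
        SELECT id FROM (
            SELECT id FROM items
            WHERE id >= ? AND active = 1
            ORDER BY id ASC LIMIT ?      -- target row plus k rows after it
        ) AS upper_half
        UNION ALL
        SELECT id FROM (
            SELECT id FROM items
            WHERE id < ? AND active = 1
            ORDER BY id DESC LIMIT ?     -- k rows before the target
        ) AS lower_half
        ORDER BY id ASC
    """
    return [row[0] for row in conn.execute(sql, (center_id, k + 1, center_id, k))]

# Hypothetical demo data: ids 1..20, with id 12 inactive.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, active INTEGER)")
conn.executemany(
    "INSERT INTO items VALUES (?, ?)",
    [(i, 0 if i == 12 else 1) for i in range(1, 21)],
)

print(rows_around(conn, 10, 5))
print(rows_around(conn, 2, 5))
```

One thing to watch: if the target id is missing or inactive, the first half still returns k + 1 rows starting from the next active id, so the result is no longer centered on a real row.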

  • 2021-02-18 17:53

    If you wanted to do it in a 'relationally pure' way, you could write a query that sorted and numbered the rows. Like:

    select (
      select count(*) from employees b
      where b.name < a.name
    ) as idx, name
    from employees a
    order by name
    

    Then use that as a common table expression. Write a select which filters it down to the rows you're interested in, then join it back onto itself using a criterion that the index of the right-hand copy of the table is no more than k larger or smaller than the index of the row on the left. Project over just the rows on the right. Like:

    with numbered_emps as (
      select (
        select count(*)
        from employees b
        where b.name < a.name
      ) as idx, name
      from employees a
      order by name
    )
    select b.*
    from numbered_emps a, numbered_emps b
    where a.name like '% Smith' -- this is your main selection criterion
    and ((b.idx - a.idx) between -5 and 5) -- this is your adjacency fuzzy-join criterion
    

    What could be simpler!

    I'd imagine the row-number based solutions will be faster, though.
