I have data that is arranged in a ring structure (or circular buffer), that is it can be expressed as sequences that cycle: ...-1-2-3-4-5-1-2-3-.... See this picture to get an idea of a 5-part ring:
I'd like to create a window query that can combine the lag and lead items into a three point array, but I can't figure it out. For example at part 1 of a 5-part ring, the lag/lead sequence is 5-1-2, or at part 4 is 3-4-5.
Here is an example table of two rings with different numbers of parts (always more than three per ring):
create table rp (ring int, part int);
insert into rp(ring, part) values(1, generate_series(1, 5));
insert into rp(ring, part) values(2, generate_series(1, 7));
Here is a nearly successful query:
SELECT ring, part, array[
lag(part, 1, NULL) over (partition by ring),
part,
lead(part, 1, 1) over (partition by ring)
] AS neighbours
FROM rp;
ring | part | neighbours
------+------+------------
1 | 1 | {NULL,1,2}
1 | 2 | {1,2,3}
1 | 3 | {2,3,4}
1 | 4 | {3,4,5}
1 | 5 | {4,5,1}
2 | 1 | {NULL,1,2}
2 | 2 | {1,2,3}
2 | 3 | {2,3,4}
2 | 4 | {3,4,5}
2 | 5 | {4,5,6}
2 | 6 | {5,6,7}
2 | 7 | {6,7,1}
(12 rows)
The only thing I need to do is to replace the NULL
with the ending point of each ring, which is the last value. Now, along with lag
and lead
window functions, there is a last_value
function which would be ideal. However, these cannot be nested:
SELECT ring, part, array[
lag(part, 1, last_value(part) over (partition by ring)) over (partition by ring),
part,
lead(part, 1, 1) over (partition by ring)
] AS neighbours
FROM rp;
ERROR: window function calls cannot be nested
LINE 2: lag(part, 1, last_value(part) over (partition by ring)) ...
Update. Thanks to @Justin's suggestion to use coalesce
to avoid nesting window functions. Furthermore, it has been pointed out by numerous folks that first/last values need an explicit order by
on the ring sequence, which happens to be part
for this example. So randomising the input data a bit:
create table rp (ring int, part int);
insert into rp(ring, part) select 1, generate_series(1, 5) order by random();
insert into rp(ring, part) select 2, generate_series(1, 7) order by random();
- Use
COALESCE
like @Justin provided. With
first_value()
/last_value()
you need to add anORDER BY
clause to the window definition or the order is undefined. You just got lucky in the example, because the rows happen to be in order right after creating the dummy table.
Once you addORDER BY
, the default window frame ends at the current row, and you need to special case thelast_value()
call - or revert the sort order in the window frame like demonstrated in my first example.When reusing a window definition multiple times, an explicit
WINDOW
clause simplifies syntax a lot:
SELECT ring, part, ARRAY[
coalesce(
lag(part) OVER w
,first_value(part) OVER (PARTITION BY ring ORDER BY part DESC))
,part
,coalesce(
lead(part) OVER w
,first_value(part) OVER w)
] AS neighbours
FROM rp
WINDOW w AS (PARTITION BY ring ORDER BY part);
Better yet, reuse the same window definition, so Postgres can calculate all values in a single scan. For this to work we need to define a custom window frame:
SELECT ring, part, ARRAY[
coalesce(
lag(part) OVER w
,last_value(part) OVER w)
,part
,coalesce(
lead(part) OVER w
,first_value(part) OVER w)
] AS neighbours
FROM rp
WINDOW w AS (PARTITION BY ring
ORDER BY part
RANGE BETWEEN UNBOUNDED PRECEDING
AND UNBOUNDED FOLLOWING)
ORDER BY 1,2;
You can even adapt the frame definition for each window function call:
SELECT ring, part, ARRAY[
coalesce(
lag(part) OVER w
,last_value(part) OVER (w RANGE BETWEEN CURRENT ROW
AND UNBOUNDED FOLLOWING))
,part
,coalesce(
lead(part) OVER w
,first_value(part) OVER w)
] AS neighbours
FROM rp
WINDOW w AS (PARTITION BY ring ORDER BY part)
ORDER BY 1,2;
Might be faster for rings with many parts. You'll have to test.
SQL Fiddle demonstrating all three with an improved test case. Consider query plans.
More about window frame definitions:
Query:
SELECT ring, part, array[
coalesce(lag(part, 1, NULL) over (partition by ring),
max(part) over (partition by ring)),
part,
lead(part, 1, 1) over (partition by ring)
] AS neighbours
FROM rp;
Result:
| RING | PART | NEIGHBOURS |
|------|------|------------|
| 1 | 1 | 5,1,2 |
| 1 | 2 | 1,2,3 |
| 1 | 3 | 2,3,4 |
| 1 | 4 | 3,4,5 |
| 1 | 5 | 4,5,1 |
| 2 | 1 | 7,1,2 |
| 2 | 2 | 1,2,3 |
| 2 | 3 | 2,3,4 |
| 2 | 4 | 3,4,5 |
| 2 | 5 | 4,5,6 |
| 2 | 6 | 5,6,7 |
| 2 | 7 | 6,7,1 |
来源:https://stackoverflow.com/questions/25030179/how-to-use-a-ring-data-structure-in-window-functions