In Postgresql, if you unnest two arrays of the same size, they line up each value from one array with one from the other, but if the two arrays are not the same size, it joins e
Support for set-returning functions in SELECT
is a PostgreSQL extension, and an IMO very weird one. It's broadly considered deprecated and best avoided where possible.
SELECT
where possibleNow that LATERAL
is supported in 9.3, one of the two main uses is gone. It used to be necessary to use a set-returning function in SELECT
if you wanted to use the output of one SRF as the input to another; that is no longer needed with LATERAL
.
The other use will be replaced in 9.4, when WITH ORDINALITY
is added, allowing you to preserve the output ordering of a set-returning function. That's currently the main remaining use: to do things like zip the output of two SRFs into a rowset of matched value pairs. WITH ORDINALITY
is most anticipated for unnest
, but works with any other SRF.
The logic that PostgreSQL is using here (for whatever IMO insane reason it was originally introduced in ancient history) is: whenever either function produces output, emit a row. If only one function has produced output, scan the other one's output again to get the rows required. If neither produces output, stop emitting rows.
It's easier to see with generate_series
.
regress=> SELECT generate_series(1,2), generate_series(1,2);
generate_series | generate_series
-----------------+-----------------
1 | 1
2 | 2
(2 rows)
regress=> SELECT generate_series(1,2), generate_series(1,3);
generate_series | generate_series
-----------------+-----------------
1 | 1
2 | 2
1 | 3
2 | 1
1 | 2
2 | 3
(6 rows)
regress=> SELECT generate_series(1,2), generate_series(1,4);
generate_series | generate_series
-----------------+-----------------
1 | 1
2 | 2
1 | 3
2 | 4
(4 rows)
In the majority of cases what you really want is a simple cross join of the two, which is a lot saner.
regress=> SELECT a, b FROM generate_series(1,2) a, generate_series(1,2) b;
a | b
---+---
1 | 1
1 | 2
2 | 1
2 | 2
(4 rows)
regress=> SELECT a, b FROM generate_series(1,2) a, generate_series(1,3) b;
a | b
---+---
1 | 1
1 | 2
1 | 3
2 | 1
2 | 2
2 | 3
(6 rows)
regress=> SELECT a, b FROM generate_series(1,2) a, generate_series(1,4) b;
a | b
---+---
1 | 1
1 | 2
1 | 3
1 | 4
2 | 1
2 | 2
2 | 3
2 | 4
(8 rows)
The main exception is currently for when you want to run multiple functions in lock-step, pairwise (like a zip
), which you cannot currently do with joins.
WITH ORDINALITY
This will be improved in 9.4 with WITH ORDINALITY
, a d while it'll be a bit less efficient than a multiple SRF scan in SELECT (unless optimizer improvements are added) it'll be a lot saner.
Say you wanted to pair up 1..3
and 10..40
with nulls for excess elements. Using with ordinality
that'd be (PostgreSQL 9.4 only):
regress=# SELECT aval, bval
FROM generate_series(1,3) WITH ORDINALITY a(aval,apos)
RIGHT OUTER JOIN generate_series(1,4) WITH ORDINALITY b(bval, bpos)
ON (apos=bpos);
aval | bval
------+------
1 | 1
2 | 2
3 | 3
| 4
(4 rows)
wheras the srf-in-from would instead return:
regress=# SELECT generate_series(1,3) aval, generate_series(1,4) bval;
aval | bval
------+------
1 | 1
2 | 2
3 | 3
1 | 4
2 | 1
3 | 2
1 | 3
2 | 4
3 | 1
1 | 2
2 | 3
3 | 4
(12 rows)