What is the difference between LATERAL and a subquery in PostgreSQL?

后端 未结 4 1106
花落未央
花落未央 2020-11-21 08:58

Since Postgres came out with the ability to do LATERAL joins, I\'ve been reading up on it, since I currently do complex data dumps for my team with lots of inef

4条回答
  •  盖世英雄少女心
    2020-11-21 09:07

    What is a LATERAL join?

    The feature was introduced with PostgreSQL 9.3.
    Quoting the manual:

    Subqueries appearing in FROM can be preceded by the key word LATERAL. This allows them to reference columns provided by preceding FROM items. (Without LATERAL, each subquery is evaluated independently and so cannot cross-reference any other FROM item.)

    Table functions appearing in FROM can also be preceded by the key word LATERAL, but for functions the key word is optional; the function's arguments can contain references to columns provided by preceding FROM items in any case.

    Basic code examples are given there.

    More like a correlated subquery

    A LATERAL join is more like a correlated subquery, not a plain subquery, in that expressions to the right of a LATERAL join are evaluated once for each row left of it - just like a correlated subquery - while a plain subquery (table expression) is evaluated once only. (The query planner has ways to optimize performance for either, though.)
    Related answer with code examples for both side by side, solving the same problem:

    • Optimize GROUP BY query to retrieve latest row per user

    For returning more than one column, a LATERAL join is typically simpler, cleaner and faster.
    Also, remember that the equivalent of a correlated subquery is LEFT JOIN LATERAL ... ON true:

    • Call a set-returning function with an array argument multiple times

    Things a subquery can't do

    There are things that a LATERAL join can do, but a (correlated) subquery cannot (easily). A correlated subquery can only return a single value, not multiple columns and not multiple rows - with the exception of bare function calls (which multiply result rows if they return multiple rows). But even certain set‑returning functions are only allowed in the FROM clause. Like unnest() with multiple parameters in Postgres 9.4 or later. The manual:

    This is only allowed in the FROM clause;

    So this works, but cannot (easily) be replaced with a subquery:

    CREATE TABLE tbl (a1 int[], a2 int[]);
    SELECT * FROM tbl, unnest(a1, a2) u(elem1, elem2);  -- implicit LATERAL
    

    The comma (,) in the FROM clause is short notation for CROSS JOIN.
    LATERAL is assumed automatically for table functions.
    About the special case of UNNEST( array_expression [, ... ] ):

    • How do you declare a set-returning-function to only be allowed in the FROM clause?

    Set-returning functions in the SELECT list

    You can also use set-returning functions like unnest() in the SELECT list directly. This used to exhibit surprising behavior with more than one such function in the same SELECT list up to Postgres 9.6. But it has finally been sanitized with Postgres 10 and is a valid alternative now (even if not standard SQL). See:

    • What is the expected behaviour for multiple set-returning functions in SELECT clause?

    Building on above example:

    SELECT *, unnest(a1) AS elem1, unnest(a2) AS elem2
    FROM   tbl;
    

    Comparison:

    dbfiddle for pg 9.6 here
    dbfiddle for pg 10 here

    Clarify misinformation

    The manual:

    For the INNER and OUTER join types, a join condition must be specified, namely exactly one of NATURAL, ON join_condition, or USING (join_column [, ...]). See below for the meaning.
    For CROSS JOIN, none of these clauses can appear.

    So these two queries are valid (even if not particularly useful):

    SELECT *
    FROM   tbl t
    LEFT   JOIN LATERAL (SELECT * FROM b WHERE b.t_id = t.t_id) t ON TRUE;
    
    SELECT *
    FROM   tbl t, LATERAL (SELECT * FROM b WHERE b.t_id = t.t_id) t;
    

    While this one is not:

    SELECT *
    FROM   tbl t
    LEFT   JOIN LATERAL (SELECT * FROM b WHERE b.t_id = t.t_id) t;

    That's why Andomar's code example is correct (the CROSS JOIN does not require a join condition) and Attila's is was not.

提交回复
热议问题