Postgres: Why is the performance so bad on subselects with Offset/Limit?

梦谈多话 2021-01-20 06:38

Can you please help me understand the reason for the performance drop between these statements?

For me it seems like in case of D & E he is first joining the add

2 Answers
  •  星月不相逢
    2021-01-20 06:48

    This seems to perform reasonably for the ranks={1,2} case. (CTEs were terrible, FYI.)

    -- EXPLAIN ANALYZE
    SELECT s.user_id
            , MAX (CASE WHEN a0.rn = 1 THEN a0.address_id ELSE NULL END) AS ad1
            , MAX (CASE WHEN a0.rn = 2 THEN a0.address_id ELSE NULL END) AS ad2
    FROM subscribers s
    JOIN (  SELECT user_id, address_id
            , row_number() OVER(PARTITION BY user_id ORDER BY address_id) AS rn
            FROM address
            )a0 ON a0.user_id = s.user_id AND a0.rn <= 2
    GROUP BY s.user_id
    ORDER BY s.user_id
    OFFSET 10000 LIMIT 200
            ;
    

    UPDATE: the query below seems to perform slightly better:

    -- ----------------------------------
    -- EXPLAIN ANALYZE
    SELECT s.user_id
            , MAX (CASE WHEN a0.rn = 1 THEN a0.address_id ELSE NULL END) AS ad1
            , MAX (CASE WHEN a0.rn = 2 THEN a0.address_id ELSE NULL END) AS ad2
    FROM ( SELECT user_id
            FROM subscribers
            ORDER BY user_id
            OFFSET 10000
            LIMIT 200
            ) s 
    JOIN (     SELECT user_id, address_id
            , row_number() OVER(PARTITION BY user_id ORDER BY address_id) AS rn
            FROM address
            ) a0 ON a0.user_id = s.user_id AND a0.rn <= 2
    GROUP BY s.user_id
    ORDER BY s.user_id
            ;
    

    Note: in both queries the JOINs should probably be LEFT JOINs, to allow for the 1st and 2nd address to be missing.
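
    As a rough sketch of that note (not tested against the original data), the second query could be written with a LEFT JOIN as below. It assumes the same `subscribers` and `address` tables. Because the `rn <= 2` filter sits in the ON clause rather than in a WHERE clause, subscribers without a first or second address should still be returned, with NULL in `ad1` and/or `ad2`.

    ```sql
    -- EXPLAIN ANALYZE
    SELECT s.user_id
            , MAX (CASE WHEN a0.rn = 1 THEN a0.address_id ELSE NULL END) AS ad1
            , MAX (CASE WHEN a0.rn = 2 THEN a0.address_id ELSE NULL END) AS ad2
    FROM ( SELECT user_id              -- pick the page of 200 subscribers first
            FROM subscribers
            ORDER BY user_id
            OFFSET 10000
            LIMIT 200
            ) s
    LEFT JOIN ( SELECT user_id, address_id
            , row_number() OVER(PARTITION BY user_id ORDER BY address_id) AS rn
            FROM address
            ) a0 ON a0.user_id = s.user_id AND a0.rn <= 2   -- keep the filter in ON, not WHERE
    GROUP BY s.user_id
    ORDER BY s.user_id
            ;
    ```

    Moving `a0.rn <= 2` into a WHERE clause would discard the NULL-extended rows again and effectively turn this back into an inner join.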


    UPDATE: combining the subsetting subquery (as in @David Aldridge's answer) with the original approach (two scalar subqueries):

    Joining the subscribers table to a subsetting subquery on itself (a self-join) allows indexes to be used for the scalar subqueries, which are then evaluated only for the 200 selected rows, without the need to compute and throw away the first 10,000 result rows skipped by the OFFSET.

    -- EXPLAIN ANALYZE
    SELECT s.user_id
    , (SELECT address_id
            FROM address a
            WHERE a.user_id = s.user_id
            ORDER BY address_id OFFSET 0 LIMIT 1
            ) AS a_id1
    , (SELECT address_id
            FROM address a
            WHERE a.user_id = s.user_id
            ORDER BY address_id OFFSET 1 LIMIT 1
            ) AS a_id2
    FROM subscribers s
    JOIN (
            SELECT user_id
            FROM subscribers
            ORDER BY user_id
            OFFSET 10000 LIMIT 200
            ) x ON x.user_id = s.user_id
    ORDER BY s.user_id
            ;
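
    For the scalar subqueries to be answered from an index, a composite index on `address` along the following lines would probably be needed. This is only a sketch: the index name is hypothetical, the original post does not show which indexes exist, and the literal `12345` is a made-up `user_id` used purely for illustration.

    ```sql
    -- Hypothetical index name; assumes address has no equivalent index yet.
    CREATE INDEX IF NOT EXISTS address_user_id_address_id_idx
            ON address (user_id, address_id);

    -- Check how a single per-user lookup is executed.
    -- 12345 is a made-up user_id, not a value from the original post.
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT address_id
    FROM address a
    WHERE a.user_id = 12345
    ORDER BY address_id OFFSET 1 LIMIT 1
            ;
    ```

    With the columns in the order (user_id, address_id), the index both locates a user's rows and returns them already sorted by `address_id`, so the planner may be able to avoid a separate sort for the `OFFSET ... LIMIT 1` lookups.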
    
