Slow query on “UNION ALL” view

伪装坚强ぢ 2021-02-13 06:20

I have a DB view which basically consists of two SELECT queries with UNION ALL, like this:

CREATE VIEW v AS
SELECT time, etc. FROM t1 /         


        
8 Answers
  • 2021-02-13 07:00

    I don't know Postgres, but some RDBMSs use indexes less effectively with comparison operators than with BETWEEN. I would try BETWEEN, keeping in mind that it includes both bounds, whereas time >= ... AND time < ... excludes the upper one.

    SELECT ... FROM v WHERE time BETWEEN ... AND ...
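
The difference in bounds is easy to check before switching. Below is a minimal sketch using Python's built-in sqlite3 module as a stand-in for the real database; the table contents are made up for illustration, and it says nothing about which form uses an index better.

```python
import sqlite3

# Assumption: SQLite stands in for the real database; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (time INTEGER)")
conn.executemany("INSERT INTO t1 VALUES (?)", [(1,), (2,), (3,), (4,)])

# Half-open range as in the question: lower bound included, upper bound excluded.
half_open = [row[0] for row in conn.execute(
    "SELECT time FROM t1 WHERE time >= ? AND time < ? ORDER BY time", (2, 4))]

# BETWEEN includes both bounds, so the row at the upper bound also matches.
between = [row[0] for row in conn.execute(
    "SELECT time FROM t1 WHERE time BETWEEN ? AND ? ORDER BY time", (2, 4))]

print(half_open)  # [2, 3]
print(between)    # [2, 3, 4]
```

If the upper bound must stay exclusive, the BETWEEN version needs an adjusted upper value.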
    
  • 2021-02-13 07:01

    Try creating your view using UNION DISTINCT instead of UNION ALL. See if it gives wrong results. See if it gives faster performance.

    If it gives wrong results, try to map your SQL operations on tables back to relational operations on relations. The elements of a relation are always distinct. There may be something fundamentally wrong with your model.

    I am deeply suspicious of the LEFT JOINS in the query plan you showed. It shouldn't be necessary to perform LEFT JOINS in order to get the results you appear to be selecting.
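
To see what "wrong results" could look like, here is a minimal sketch of how the two operators differ when the same row exists in both tables. It uses Python's sqlite3 as a stand-in for PostgreSQL, with a hypothetical val column in place of "etc." and plain UNION, which standard SQL treats as UNION DISTINCT.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; "val" is a made-up column.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (time INTEGER, val TEXT);
    CREATE TABLE t2 (time INTEGER, val TEXT);
    INSERT INTO t1 VALUES (1, 'a'), (2, 'b');
    INSERT INTO t2 VALUES (2, 'b'), (3, 'c');  -- (2, 'b') is in both tables
""")

# UNION ALL keeps every row from both branches, duplicates included.
union_all = conn.execute(
    "SELECT time, val FROM t1 UNION ALL SELECT time, val FROM t2 ORDER BY time"
).fetchall()

# UNION removes duplicate rows, which costs an extra deduplication step.
union_distinct = conn.execute(
    "SELECT time, val FROM t1 UNION SELECT time, val FROM t2 ORDER BY time"
).fetchall()

print(len(union_all))   # 4
print(union_distinct)   # [(1, 'a'), (2, 'b'), (3, 'c')]
```

If the two row counts differ on the real data, the two views are not interchangeable.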

  • 2021-02-13 07:01

    I don't have enough reputation points to comment, so I am posting this as an answer.

    I don't know how PostgreSQL works behind the scenes, but the way Oracle would handle this may give you a clue.

    Your UNION ALL view is slower because, behind the scenes, the records from SELECT #1 and SELECT #2 are first combined in a temporary table created on the fly, and then your SELECT ... FROM v WHERE time >= ... AND time < ... is executed against that temporary table. Both #1 and #2 are indexed, so individually they run fast, as expected. The temporary table, however, is not indexed, and because the final records are selected from it, the response is slower.

    For now, at least, I don't see any way to make it faster while keeping it a non-materialized view.

    Other than running SELECT #1 and #2 yourself and combining them with an explicit UNION, one way to make it faster would be a stored procedure, or a function in your application programming language (if you have one), that makes a separate call to each indexed table and then combines the results. That is not as simple as SELECT ... FROM v WHERE time >= ... AND time < ... :(
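
An application-side function of that kind might look roughly like the sketch below. It assumes Python with the built-in sqlite3 module standing in for the real database driver; select_range, the val column, and the index names are hypothetical.

```python
import heapq
import sqlite3

# Assumption: SQLite stands in for the real database; schema and rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (time INTEGER, val TEXT);
    CREATE TABLE t2 (time INTEGER, val TEXT);
    CREATE INDEX t1_time_idx ON t1 (time);
    CREATE INDEX t2_time_idx ON t2 (time);
    INSERT INTO t1 VALUES (1, 'a'), (3, 'c'), (5, 'e');
    INSERT INTO t2 VALUES (2, 'b'), (4, 'd'), (6, 'f');
""")

def select_range(conn, start, end):
    """Query each indexed table separately, then merge the sorted results."""
    sql = "SELECT time, val FROM {} WHERE time >= ? AND time < ? ORDER BY time"
    rows_t1 = conn.execute(sql.format("t1"), (start, end)).fetchall()
    rows_t2 = conn.execute(sql.format("t2"), (start, end)).fetchall()
    # Both inputs are already ordered by time, so a linear merge is enough.
    return list(heapq.merge(rows_t1, rows_t2, key=lambda row: row[0]))

rows = select_range(conn, 2, 6)
print(rows)  # [(2, 'b'), (3, 'c'), (4, 'd'), (5, 'e')]
```

The cost is two round trips instead of one, plus the merge logic living in the application rather than in the view.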

  • 2021-02-13 07:04

    Combine the two tables. Add a column to indicate the original table. If necessary, replace the original table names with views that select just the relevant rows. Problem solved!

    Looking into the superclass/subclass db design pattern could be of use to you.
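
As a rough illustration of that layout, the sketch below uses Python's sqlite3 in place of the real database. The names t_all, source, and val are hypothetical, and whether merging is feasible depends on how similar the two tables really are.

```python
import sqlite3

# Assumption: SQLite stands in for the real database; names and rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One physical table; "source" records which original table a row came from.
    CREATE TABLE t_all (source TEXT NOT NULL, time INTEGER NOT NULL, val TEXT);
    CREATE INDEX t_all_time_idx ON t_all (time);
    INSERT INTO t_all VALUES
        ('t1', 1, 'a'), ('t2', 2, 'b'), ('t1', 3, 'c'), ('t2', 4, 'd');

    -- The old table names survive as views over the relevant rows.
    CREATE VIEW t1 AS SELECT time, val FROM t_all WHERE source = 't1';
    CREATE VIEW t2 AS SELECT time, val FROM t_all WHERE source = 't2';
""")

# The range query now reads a single table with a single index on time.
in_range = conn.execute(
    "SELECT source, time, val FROM t_all WHERE time >= ? AND time < ? ORDER BY time",
    (2, 4),
).fetchall()

# Existing code that reads t1 keeps working through the view.
only_t1 = conn.execute("SELECT time, val FROM t1 ORDER BY time").fetchall()

print(in_range)  # [('t2', 2, 'b'), ('t1', 3, 'c')]
print(only_t1)   # [(1, 'a'), (3, 'c')]
```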

  • 2021-02-13 07:06

    One possibility would be to issue new SQL dynamically on each call instead of creating a view, and to integrate the WHERE clause into each SELECT of the UNION query:

    SELECT time, etc. FROM t1
        WHERE time >= ... AND time < ...
    UNION ALL
    SELECT time, etc. FROM t2
        WHERE time >= ... AND time < ...
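
From application code, issuing that statement on each call could look like the sketch below. It assumes Python's sqlite3 as a stand-in driver; fetch_range, the val column, and the :lo/:hi parameter names are hypothetical. The bounds are passed as bound parameters rather than pasted into the SQL string.

```python
import sqlite3

# Assumption: SQLite stands in for the real database; schema and rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (time INTEGER, val TEXT);
    CREATE TABLE t2 (time INTEGER, val TEXT);
    INSERT INTO t1 VALUES (1, 'a'), (3, 'c'), (5, 'e');
    INSERT INTO t2 VALUES (2, 'b'), (4, 'd'), (6, 'f');
""")

# The same range filter is written into each branch of the UNION ALL.
RANGE_QUERY = """
    SELECT time, val FROM t1 WHERE time >= :lo AND time < :hi
    UNION ALL
    SELECT time, val FROM t2 WHERE time >= :lo AND time < :hi
"""

def fetch_range(conn, lo, hi):
    """Run the per-branch filtered query with bound parameters."""
    return conn.execute(RANGE_QUERY, {"lo": lo, "hi": hi}).fetchall()

# UNION ALL promises no particular order, so sort before comparing or printing.
rows = sorted(fetch_range(conn, 2, 5))
print(rows)  # [(2, 'b'), (3, 'c'), (4, 'd')]
```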
    

    EDIT:

    Can you use a parametrized function?

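    -- Parameters are named from_time/to_time so they cannot be confused
    -- with the tables t1 and t2.
    CREATE OR REPLACE FUNCTION CallMyView(from_time date, to_time date)
    RETURNS TABLE(d date, etc.)
    AS $$
        BEGIN
            RETURN QUERY
                SELECT time, etc. FROM t1
                    WHERE time >= from_time AND time < to_time
                UNION ALL
                SELECT time, etc. FROM t2
                    WHERE time >= from_time AND time < to_time;
        END;
    $$ LANGUAGE plpgsql;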
    

    Call it with:

    SELECT * FROM CallMyView(..., ...);
    
  • 2021-02-13 07:11

    I believe your query is being executed something like this:

    (
       ( SELECT time, etc. FROM t1 /* #1... */ )
       UNION ALL
       ( SELECT time, etc. FROM t2 /* #2... */ )
    )
    WHERE time >= ... AND time < ...
    

    which the optimizer has difficulty optimizing; i.e., it performs the UNION ALL first and only then applies the WHERE clause, whereas you want the WHERE clause applied before the UNION ALL.

    Couldn't you put your WHERE clause in the CREATE VIEW?

    CREATE VIEW v AS
    ( SELECT time, etc. FROM t1  WHERE time >= ... AND time < ... )
    UNION ALL
    ( SELECT time, etc. FROM t2  WHERE time >= ... AND time < ... )
    

    Alternatively, if the view cannot contain the WHERE clause, perhaps you can keep two separate views and apply the WHERE clause and the UNION ALL when you query them:

    CREATE VIEW v1 AS
    SELECT time, etc. FROM t1 /* #1... */;
    
    CREATE VIEW v2 AS
    SELECT time, etc. FROM t2 /* #2... */;
    
    ( SELECT * FROM v1 WHERE time >= ... AND time < ... )
    UNION ALL
    ( SELECT * FROM v2 WHERE time >= ... AND time < ... );
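
Filtering each view and then combining should return the same rows as filtering the combined view. Here is a minimal sketch for checking that, and for comparing the plans, using Python's sqlite3 as a stand-in. The val column and the data are hypothetical, and SQLite's planner is not PostgreSQL's, so on the real database you would run EXPLAIN on both forms instead.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; schema and rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (time INTEGER, val TEXT);
    CREATE TABLE t2 (time INTEGER, val TEXT);
    CREATE INDEX t1_time_idx ON t1 (time);
    CREATE INDEX t2_time_idx ON t2 (time);
    INSERT INTO t1 VALUES (1, 'a'), (3, 'c'), (5, 'e');
    INSERT INTO t2 VALUES (2, 'b'), (4, 'd'), (6, 'f');

    CREATE VIEW v1 AS SELECT time, val FROM t1;
    CREATE VIEW v2 AS SELECT time, val FROM t2;
    CREATE VIEW v  AS SELECT time, val FROM t1 UNION ALL SELECT time, val FROM t2;
""")

# Filter applied outside the combined view.
FILTER_VIEW = "SELECT * FROM v WHERE time >= :lo AND time < :hi"
# Filter applied to each view before combining (SQLite takes no parentheses here).
FILTER_BRANCHES = """
    SELECT * FROM v1 WHERE time >= :lo AND time < :hi
    UNION ALL
    SELECT * FROM v2 WHERE time >= :lo AND time < :hi
"""
params = {"lo": 2, "hi": 5}

via_view = sorted(conn.execute(FILTER_VIEW, params).fetchall())
per_branch = sorted(conn.execute(FILTER_BRANCHES, params).fetchall())
print(via_view == per_branch)  # True

# Print both plans; the last column of each plan row is the description.
for label, sql in (("view", FILTER_VIEW), ("branches", FILTER_BRANCHES)):
    for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params):
        print(label, row[-1])
```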
    